                                                    17.2. TWO METHODS
                            Almost immediately two possible routes to minimising S(x) suggest themselves.
                            The Cauchy steepest descents method
Find the gradient 2v(x) of S(x) and step downhill along it. (The reason for the
                            factor 2 will become apparent shortly.) Suppose that t represents the step length
                            along the gradient, then for some t we have
                                                  S(x - tv) < S(x)                       (17.6)
                            except at a local minimum or a saddle point. The steepest descents method
                            replaces x by (x-tv) and repeats the process from the new point. The iteration is
                            continued until a t cannot be found for which (17.6) is satisfied. The method,
                            which was suggested by Cauchy (1848), is then taken to have converged. It can be
                            shown always to converge if S(x) is convex, that is, if
                                    S(cx1 + (1 - c)x2) < cS(x1) + (1 - c)S(x2)           (17.7)
                            for 0<c<1. Even for non-convex functions which are bounded from below, the
                            steepest descents method will find a local minimum or saddle point. All the
                            preceding results are, of course, subject to the provision that the function and
                            gradient are computed exactly (an almost impossible requirement). In practice,
                            however, convergence is so slow as to disqualify the method of steepest descents
                            on its own as a candidate for minimising functions.
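  As an illustration only, the basic iteration can be set down in a few lines. The Python sketch below is not the program referred to later in this section; the halving search for t and its tolerances are arbitrary choices, and S and its gradient are assumed to be supplied by the caller together with a starting point.

import numpy as np

def steepest_descent(S, grad, x, t0=1.0, tmin=1e-12, maxit=1000):
    # Minimal sketch of Cauchy's steepest descents: at each iteration
    # search for a step length t satisfying S(x - t*v) < S(x), then
    # replace x by (x - t*v).  The halving search and the tolerances
    # are illustrative choices only.
    x = np.asarray(x, dtype=float)
    for _ in range(maxit):
        v = grad(x)                  # downhill direction is -v
        S0 = S(x)
        t = t0
        while S(x - t * v) >= S0:    # shrink t until (17.6) is satisfied
            t /= 2.0
            if t < tmin:
                return x             # no acceptable t found: taken as converged
        x = x - t * v
    return x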
                              Often the cause of this slowness of convergence is the tendency of the method
                            to take pairs of steps which are virtually opposites, and which are both essentially
                            perpendicular to the direction in which the minimum is to be found. In a
                            two-parameter example we may think of a narrow valley with the minimum
                            somewhere along its length. Suppose our starting point is somewhere on the side
                            of the valley but not near this minimum. The gradient will be such that the
                            direction of steepest descent is towards the floor of the valley. However, the step
                            taken can easily traverse the valley. The situation is then similar to the original
                            one, and it is possible to step back across the valley almost to our starting point
                            with only a very slight motion along the valley toward the solution point. One can
                            picture the process as following a path similar to that which would be followed by
                            a marble or ball-bearing rolling over the valley-shaped surface.
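  The zigzag can be seen numerically on a simple quadratic valley. The short Python fragment below is an illustration constructed for this discussion, not an example taken from the text: it applies steepest descents with an exact line search to S(x) = x1^2 + 100 x2^2 from the starting point (100, 1). At every step x2 changes sign as the iterate crosses the valley, while x1, the distance still to be travelled along the valley floor, falls by only about 2%.

import numpy as np

# Narrow-valley quadratic chosen for illustration (not from the text):
# S(x) = x1**2 + 100*x2**2, whose valley floor is the x1 axis.
H = np.diag([2.0, 200.0])           # Hessian of S, so grad S(x) = H @ x
x = np.array([100.0, 1.0])          # start on the side of the valley
for k in range(6):
    g = H @ x                       # gradient at the current point
    t = (g @ g) / (g @ (H @ g))     # exact minimiser of S(x - t*g) along -g
    x = x - t * g
    print(k + 1, x)                 # x2 flips sign each step; x1 shrinks by about 2%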
                              To illustrate the slow convergence, a modified version of steepest descents was
                             programmed in BASIC on a Data General NOVA minicomputer having machine
                             precision 2^(-22). The modification consisted of step doubling if a step is successful.
                            The step length is divided by 4 if a step is unsuccessful. This reduction in step size
                            is repeated until either a smaller sum of squares is found or the step is so small
                            that none of the parameters change. As a test problem, consider the Rosenbrock
                             banana-shaped valley:

                                        S(x1, x2) = 100(x2 - x1^2)^2 + (1 - x1)^2

                             starting with
                                                        S(-1·2, 1) = 24·1999
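  For readers who wish to reproduce the flavour of this experiment, the sketch below restates the step-doubling and step-quartering strategy in Python. The original program was in BASIC on the NOVA and is not reproduced here; the starting step length, the iteration limit and the exact form of the stopping test in this sketch are assumptions rather than the author's settings.

import numpy as np

def rosenbrock(x):
    # S(x1, x2) = 100*(x2 - x1**2)**2 + (1 - x1)**2
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2

def rosenbrock_grad(x):
    # Gradient of S, i.e. 2*J^T*f for the residuals
    # f1 = 10*(x2 - x1**2), f2 = 1 - x1.
    return np.array([-400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
                     200.0 * (x[1] - x[0] ** 2)])

def modified_steepest_descent(S, grad, x, t=1.0, maxit=5000):
    # Step doubling on success, division of the step by 4 on failure;
    # stop when the step is so small that no parameter changes.  The
    # starting step and iteration limit are illustrative choices.
    x = np.asarray(x, dtype=float)
    Sx = S(x)
    for _ in range(maxit):
        v = grad(x)
        while True:
            xnew = x - t * v
            Snew = S(xnew)
            if Snew < Sx:            # successful step: accept it and double t
                x, Sx = xnew, Snew
                t *= 2.0
                break
            if np.all(xnew == x):    # step changed no parameter: stop
                return x, Sx
            t /= 4.0                 # unsuccessful step: quarter t
    return x, Sx

x0 = np.array([-1.2, 1.0])
print(rosenbrock(x0))                # about 24.2; the text quotes 24.1999 from the NOVA
x, Sx = modified_steepest_descent(rosenbrock, rosenbrock_grad, x0)
print(x, Sx)                         # progress toward the minimum at (1, 1) is slow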