for each j = 1, 2, . . . , n. From (17.10) and (17.12), therefore, we have

$$\frac{\partial v_j}{\partial x_k} = \sum_{i=1}^{m} \left( J_{ij}\,J_{ik} + f_i\,\frac{\partial^2 f_i}{\partial x_j\,\partial x_k} \right) \tag{17.15}$$
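  As a numerical illustration of (17.15) (not from the book; the exponential model, data
and step sizes below are invented for this sketch), the following Python fragment compares
the exact derivative of v, obtained by central differences, with the first term $J^{\mathrm{T}}J$
alone. Because the neglected second term carries the residuals f_i as factors, the two
matrices agree closely when the fit is good.

import numpy as np

# Toy residuals f_i(x) = x[0]*exp(x[1]*t_i) - y_i (invented example)
t = np.linspace(0.0, 1.0, 8)
y = 2.0 * np.exp(0.5 * t) + 0.05 * np.cos(7.0 * t)   # data with a small misfit

def residuals(x):
    return x[0] * np.exp(x[1] * t) - y

def jacobian(x):                                     # J[i, j] = df_i/dx_j
    e = np.exp(x[1] * t)
    return np.column_stack([e, x[0] * t * e])

def v(x):                                            # v = J^T f, as in (17.16)
    return jacobian(x).T @ residuals(x)

x = np.array([2.0, 0.5])
h = 1e-6
H = np.zeros((2, 2))
for k in range(2):                                   # dv/dx_k by central differences
    dx = np.zeros(2)
    dx[k] = h
    H[:, k] = (v(x + dx) - v(x - dx)) / (2.0 * h)

print("exact derivative of v:\n", H)                 # both terms of (17.15)
print("J^T J approximation:\n", jacobian(x).T @ jacobian(x))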

  To apply the Newton-Raphson iteration (defined by equation (15.9)) to the
solution of (17.8) thus requires the second derivatives of f with respect to the
parameters, a total of $mn^2$ computations. On the grounds that near the minimum
the functions should be ‘small’ (a very cavalier assumption), the second term in
the summation (17.15) is neglected, so that the partial derivatives of v are
approximated by the matrix $J^{\mathrm{T}}J$, reducing (17.14) to

$$J^{\mathrm{T}}J\,q = -v = -J^{\mathrm{T}}f. \tag{17.16}$$
  These are simply normal equations for a linearised least-squares problem
evaluated locally, and the Gauss-Newton iteration proceeds by replacing x by
(x + q) and repeating until either q is smaller than some predetermined tolerance
or

$$S(x + q) > S(x). \tag{17.17}$$
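  A minimal sketch of this iteration (in Python with NumPy; the function names,
tolerance and iteration limit are illustrative choices, not the book's code) might
run as follows, with residuals and jacobian supplied by the caller:

import numpy as np

def gauss_newton(residuals, jacobian, x, tol=1e-8, max_iter=50):
    """Bare Gauss-Newton loop for minimising S(x) = f(x)^T f(x)."""
    S = lambda z: float(residuals(z) @ residuals(z))
    for _ in range(max_iter):
        f = residuals(x)
        J = jacobian(x)
        q = np.linalg.solve(J.T @ J, -J.T @ f)   # normal equations (17.16)
        if np.linalg.norm(q) < tol:              # q below the tolerance
            break
        if S(x + q) > S(x):                      # condition (17.17): give up
            break
        x = x + q
    return x

Note that the unguarded step x + q is exactly what section 17.3 shows to be
fragile; this sketch stops, rather than recovers, when (17.17) occurs.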


                                              17.3. HARTLEY’S MODIFICATION
  As it has been stated, the Gauss-Newton iteration can quite easily fail since the
second term in the summation (17.15) is often not negligible, especially in the
case when the residuals f are large. However, if instead of replacing x by (x + q)
we use (x + tq), where t is a step-length parameter, it is possible to show that if J
is of full rank ($J^{\mathrm{T}}J$ is positive definite and hence non-singular), then this
modified Gauss-Newton algorithm always proceeds towards a minimum. For in this
case, we have

$$q = -(J^{\mathrm{T}}J)^{-1}J^{\mathrm{T}}f \tag{17.18}$$

so that the inner product between q and the direction of the steepest descent
$-J^{\mathrm{T}}f$ is

$$-q^{\mathrm{T}}v = f^{\mathrm{T}}J\,(J^{\mathrm{T}}J)^{-1}J^{\mathrm{T}}f > 0 \tag{17.19}$$

since $(J^{\mathrm{T}}J)^{-1}$ is positive definite by virtue of the positive definiteness of
$J^{\mathrm{T}}J$. Thus the cosine of the angle between q and -v is always positive,
guaranteeing that the search is downhill. In practice, however, angles very close to
90° are observed, which may imply slow convergence.
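  The guarantee (17.19) suggests the following sketch of the modified iteration
(Python again; the simple halving of t is a stand-in for the parabolic search
described next, and every name and constant here is illustrative rather than
Hartley's own procedure):

import numpy as np

def modified_gauss_newton(residuals, jacobian, x, tol=1e-8, max_iter=50):
    """Gauss-Newton with step x + t*q, t shrunk until S decreases."""
    S = lambda z: float(residuals(z) @ residuals(z))
    for _ in range(max_iter):
        f = residuals(x)
        J = jacobian(x)
        q = np.linalg.solve(J.T @ J, -J.T @ f)   # q from (17.18)
        if np.linalg.norm(q) < tol:
            break
        t = 1.0                                  # by (17.19), q is downhill, so
        while S(x + t * q) > S(x):               # some t > 0 must reduce S
            t *= 0.5
            if t < 1e-10:                        # guard: no acceptable step
                return x                         # (J effectively rank-deficient)
        x = x + t * q
    return x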
  The modified Gauss-Newton procedure above has been widely used to minimise
sum-of-squares functions (Hartley 1961). Usually the one-dimensional search at
each iteration is accomplished by fitting a quadratic polynomial to the sum-of-
squares surface along the search direction and evaluating the function at the
estimated minimum of this quadratic. This requires at least three sum-of-squares
function evaluations per iteration and, furthermore, the apparent simplicity of this
parabolic inverse interpolation to accomplish a linear search, when stated in words,
hides a variety of stratagems which must be incorporated to prevent moves either
in the wrong direction or to points where S(x + q) > S(x). When the function is