Page 199 - Compact Numerical Methods For Computers
P. 199

188               Compact numerical methods for computers

                            parameters by means of a sequence of steps
                                                           b ' =b - k Bg                      (15.2)
                            where g is the gradient of S. The definition of an algorithm of this type consists in
                            specifying (a) how the matrix B is computed, and (b) how k is chosen. The second
                            of these choices is the linear search problem of §13.2. In practice the algorithm
                            suggested there is unnecessarily rigorous and a simpler search will be used. The
                            rest of this section will be devoted to the task of deciding appropriate conditions
                            on the matrix B.
                              Firstly, when it works, Newton’s method generally converges very rapidly. Thus
                            it would seem desirable that B tend in some way towards the inverse Hessian
                            matrix H . However, the computational requirement in Newton’s method for
                                    -l
                            second partial derivatives and for the solution of linear-equation systems at each
                            iteration must be avoided.
                              Secondly, B should be positive definite, not only to avoid the possibility that the
                            algorithm will ‘get stuck’ because Bg lacks components necessary to reduce the
                            function because they are in the null space of B but also to permit the algorithm
                            to exhibit quadratic termination, that is, to minimise a quadratic form in at most n
                            steps (15.2). A quadratic form

                                                                                             (15.11)
                            has a unique minimum if H is positive definite. Note that H can always be made
                            symmetric, and will be presumed so here. The merits of quadratic termination and
                            positive definiteness are still being debated (see, for instance, Broyden 1972,
                            p 94). Also, a function with a saddle point or ridge has a Hessian which is not
                            always positive definite. Nevertheless, the discussion here will be confined to
                            methods which generate positive definite iteration matrices.
                              If n linearly independent directions t ,  j=1, 2, . . . , n, exist for which
                                                           j
                                                            BHt =t                           (15.12)
                                                                j  j
                            then
                                                  BH =l n      or     B=H .                  (15.13)
                                                                           - l
                            The search directions t  are sought conjugate to H, leading in an implicit fashion
                                                j
                            to the expansion
                                                                                             (15.14)

                            Since B is to be developed as a sequence of matrices B  ( m ) , the condition (15.12)
                            will be stated
                                                           B Ht =t  j                        (15.15)
                                                            (m)
                                                                 j
                            Thus we have
                                                          B  (n +1) = H .                    (15.16)
                                                                   -1
                            For a quadratic form, the change in the gradient at b , that is
                                                                           j
                                                           g =Hb - c                         (15.17)
                                                            j    j
   194   195   196   197   198   199   200   201   202   203   204