Page 199 - Compact Numerical Methods For Computers
P. 199
188 Compact numerical methods for computers
parameters by means of a sequence of steps
b ' =b - k Bg (15.2)
where g is the gradient of S. The definition of an algorithm of this type consists in
specifying (a) how the matrix B is computed, and (b) how k is chosen. The second
of these choices is the linear search problem of §13.2. In practice the algorithm
suggested there is unnecessarily rigorous and a simpler search will be used. The
rest of this section will be devoted to the task of deciding appropriate conditions
on the matrix B.
Firstly, when it works, Newton’s method generally converges very rapidly. Thus
it would seem desirable that B tend in some way towards the inverse Hessian
matrix H . However, the computational requirement in Newton’s method for
-l
second partial derivatives and for the solution of linear-equation systems at each
iteration must be avoided.
Secondly, B should be positive definite, not only to avoid the possibility that the
algorithm will ‘get stuck’ because Bg lacks components necessary to reduce the
function because they are in the null space of B but also to permit the algorithm
to exhibit quadratic termination, that is, to minimise a quadratic form in at most n
steps (15.2). A quadratic form
(15.11)
has a unique minimum if H is positive definite. Note that H can always be made
symmetric, and will be presumed so here. The merits of quadratic termination and
positive definiteness are still being debated (see, for instance, Broyden 1972,
p 94). Also, a function with a saddle point or ridge has a Hessian which is not
always positive definite. Nevertheless, the discussion here will be confined to
methods which generate positive definite iteration matrices.
If n linearly independent directions t , j=1, 2, . . . , n, exist for which
j
BHt =t (15.12)
j j
then
BH =l n or B=H . (15.13)
- l
The search directions t are sought conjugate to H, leading in an implicit fashion
j
to the expansion
(15.14)
Since B is to be developed as a sequence of matrices B ( m ) , the condition (15.12)
will be stated
B Ht =t j (15.15)
(m)
j
Thus we have
B (n +1) = H . (15.16)
-1
For a quadratic form, the change in the gradient at b , that is
j
g =Hb - c (15.17)
j j