is given by
    y_j = g_{j+1} - g_j = H(b_{j+1} - b_j) = k_j H t_j     (15.18)
since the elements of H are constant in this case. From this it follows that (15.15)
becomes
    B^{(m)} y_j = k_j t_j .     (15.19)
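Both relations are easy to check numerically for a quadratic form. The sketch below (a Python/NumPy illustration; the Hessian, the linear term and the step values are invented for the example, not taken from the text) verifies (15.18) and then (15.19) using the exact inverse Hessian in the role of B.

    # Minimal check of (15.18) and (15.19) on the quadratic
    # S(b) = 0.5 b^T H b - c^T b, whose gradient is g(b) = H b - c.
    import numpy as np

    H = np.array([[4.0, 1.0], [1.0, 3.0]])   # constant positive definite Hessian
    c = np.array([1.0, 2.0])
    grad = lambda b: H @ b - c

    b_j = np.array([0.5, -0.5])              # arbitrary current point
    t_j = np.array([1.0, 2.0])               # arbitrary search direction
    k_j = 0.3                                # arbitrary step length
    b_next = b_j + k_j * t_j

    # (15.18): the gradient difference y_j equals k_j H t_j exactly
    y_j = grad(b_next) - grad(b_j)
    assert np.allclose(y_j, k_j * H @ t_j)

    # (15.19): the true inverse Hessian B = H^{-1} maps y_j back to k_j t_j
    B = np.linalg.inv(H)
    assert np.allclose(B @ y_j, k_j * t_j)
    print("(15.18) and (15.19) hold for the quadratic test problem")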
j
Assuming (15.15) is correct for j < m, a new step

    t_m = -B^{(m)} g_m     (15.20)

is required to be conjugate to all previous directions t_j, i.e.

    t_m^T H t_j = 0     for j < m.     (15.21)
But from (15.18), (15.19) and (15.20) we obtain

    t_m^T H t_j = -g_m^T B^{(m)} H t_j = -g_m^T B^{(m)} y_j / k_j = -g_m^T t_j .     (15.22)
Suppose now that the linear searches at each step have been performed
accurately. Then the directions
    t_1 , t_2 , \ldots , t_{m-1}     (15.23)

define a hyperplane on which the quadratic form has been minimised. Thus g_m is
orthogonal to t_j, j < m, and (15.21) is satisfied, so that the new direction t_m is
conjugate to the previous ones.
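This orthogonality can be checked directly. The sketch below (all problem data invented for the illustration) builds a set of mutually conjugate directions for a small quadratic by Gram-Schmidt conjugation, takes an exact line-search step along each, and confirms that the gradient after step m is orthogonal to every direction used so far.

    # Exact line searches along conjugate directions leave the gradient
    # orthogonal to all directions searched so far (the claim behind 15.21).
    import numpy as np

    rng = np.random.default_rng(0)
    n = 4
    A = rng.standard_normal((n, n))
    H = A @ A.T + n * np.eye(n)           # positive definite Hessian
    c = rng.standard_normal(n)
    grad = lambda b: H @ b - c

    # H-conjugate directions via Gram-Schmidt conjugation of the unit basis
    dirs = []
    for e in np.eye(n):
        t = e - sum((d @ H @ e) / (d @ H @ d) * d for d in dirs)
        dirs.append(t)

    b = np.zeros(n)
    for m, t in enumerate(dirs):
        g = grad(b)
        k = -(g @ t) / (t @ H @ t)        # exact line search on the quadratic
        b = b + k * t
        g_new = grad(b)
        # new gradient is orthogonal to ALL directions used so far
        assert all(abs(g_new @ d) < 1e-8 for d in dirs[: m + 1])
    print("g_m is orthogonal to t_j for all j < m, as claimed")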
In order that B^{(m+1)} satisfies the above theory so that the process can continue,
the update C in

    B^{(m+1)} = B^{(m)} + C     (15.24)

must satisfy (15.19), that is

    B^{(m+1)} y_m = k_m t_m     (15.25)

or

    C y_m = k_m t_m - B^{(m)} y_m     (15.26)
and

    B^{(m+1)} y_j = k_j t_j     for j < m     (15.27)

or

    C y_j = k_j t_j - B^{(m)} y_j = k_j t_j - k_j t_j = 0 .     (15.28)

In establishing (15.26) and (15.28), equation (15.19) has been applied for B^{(m)}.
There are thus m conditions on the order-n matrix C. This degree of choice in C
has in part been responsible for the large literature on variable metric methods.
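For concreteness, one well-known rank-one matrix meeting condition (15.26) is C = u u^T / (u^T y_m) with u = k_m t_m - B^{(m)} y_m (the symmetric rank-one form). The sketch below, with invented data, checks the condition numerically; it is offered only to illustrate the freedom available in choosing C, not as the particular update developed in the remainder of the chapter.

    # One rank-one choice of C satisfying (15.26); all values are invented.
    import numpy as np

    n = 3
    rng = np.random.default_rng(1)
    B = np.eye(n)                         # current inverse-Hessian estimate B^(m)
    t_m = rng.standard_normal(n)          # most recent search direction
    k_m = 0.7                             # step length along t_m
    y_m = rng.standard_normal(n)          # observed gradient difference

    u = k_m * t_m - B @ y_m
    C = np.outer(u, u) / (u @ y_m)        # symmetric, rank one (assumes u^T y_m
                                          # is not negligible)

    # (15.26): C y_m = k_m t_m - B^(m) y_m, so B^(m+1) = B + C obeys (15.25)
    assert np.allclose(C @ y_m, u)
    assert np.allclose((B + C) @ y_m, k_m * t_m)
    print("rank-one C satisfies the condition (15.25)/(15.26)")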
The essence of the variable metric methods, i.e. that information regarding
the Hessian has been drawn from first derivative computations only, is somewhat
hidden in the above development. Of course, differences could have been used to
generate an approximate Hessian from (n+1) vectors of first derivatives, but a