Page 197 - Compact Numerical Methods For Computers
P. 197
Chapter 15
DESCENT TO A MINIMUM I: VARIABLE
METRIC ALGORITHMS
15.1. DESCENT METHODS FOR MINIMISATION
The Nelder-Mead algorithm is a direct search procedure in that it involves only
function values. In the next few sections, methods will be considered which make
use of the gradient of the function S(b) which will be called g:
(15.1)
for j = 1, 2, . . . , n
evaluated at the point b. Descent methods all use the basic iterative step
b' =b- k B g (15.2)
where B is a matrix defining a transformation of the gradient and k is a step
length. The simplest such algorithm, the method of steepest descents, was proposed
by Cauchy (1848) for the solution of systems of nonlinear equations. This uses
B=1 n (15.3)
and any step length k which reduces the function so that
S( b')<S(b). (15.4)
The principal difficulty with steepest descents is its tendency to hemstitch, that is,
to criss-cross a valley on the function S(b) instead of following the floor of the
valley to a minimum. Kowalik and Osborne (1968, pp 34-9) discuss some of the
reasons for this weakness, which is primarily that the search directions generated
are not linearly independent. Thus a number of methods have been developed
which aim to transform the gradient g so that the search directions generated in
(15.2) are linearly independent or, equivalently, are conjugate to each other with
respect to some positive definite matrix A. In other words, if x i and x j are search
directions, x i and x are conjugate with respect to the positive definite matrix A if
j
(15.5)
The conjugate gradients algorithm 22 generates such a set of search directions
implicitly, avoiding the storage requirements of either a transformation matrix B
or the previous search directions. The variable metric algorithm 21 uses a
transformation matrix which is adjusted at each step to generate appropriate
search directions. There is, however, another way to think of this process.
Consider the set of nonlinear equations formed by the gradient at a minimum
g (b')=0. (15.6)
186