for each j = 1, 2, . . . , n. From (17.10) and (17.12), therefore, we have

∂v_j/∂x_k = Σ_{i=1}^{m} [(∂f_i/∂x_j)(∂f_i/∂x_k) + f_i ∂²f_i/(∂x_j ∂x_k)].        (17.15)
To apply the Newton-Raphson iteration (defined by equation (15.9)) to the
solution of (17.8) thus requires the second derivatives of f with respect to the
parameters, a total of mn² computations. On the grounds that near the minimum
the functions should be ‘small’ (a very cavalier assumption), the second term in
the summation (17.15) is neglected, so that the partial derivatives of v are
approximated by the matrix JᵀJ, reducing (17.14) to

JᵀJ q = -v = -Jᵀf.        (17.16)
These are simply normal equations for a linearised least-squares problem
evaluated locally, and the Gauss-Newton iteration proceeds by replacing x by
(x+q) and repeating until either q is smaller than some predetermined tolerance
or
S(x + q) > S(x).        (17.17)
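Stated in code, one Gauss-Newton cycle is quite short. The following Python
sketch is purely illustrative and is not the implementation developed in this
book: the exponential model, the data points and the stopping constants are
all hypothetical, and the normal equations (17.16) are solved directly, whereas
a careful implementation would use one of the decompositions treated in earlier
chapters rather than forming JᵀJ explicitly.

    import numpy as np

    def gauss_newton(f, jac, x, tol=1e-8, max_iter=50):
        # Minimise S(x) = f(x)'f(x) by the basic Gauss-Newton iteration.
        # f returns the m residuals; jac returns the m x n Jacobian J.
        S = lambda z: np.dot(f(z), f(z))
        for _ in range(max_iter):
            J, fx = jac(x), f(x)
            q = np.linalg.solve(J.T @ J, -J.T @ fx)  # normal equations (17.16)
            if np.linalg.norm(q) < tol:              # q below the tolerance
                break
            if S(x + q) > S(x):                      # condition (17.17) holds
                break
            x = x + q
        return x

    # Hypothetical use: fit y = x1*exp(x2*t) to four data points.
    t = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.array([2.0, 5.4, 14.8, 40.1])
    f = lambda x: x[0] * np.exp(x[1] * t) - y
    jac = lambda x: np.column_stack((np.exp(x[1] * t),
                                     x[0] * t * np.exp(x[1] * t)))
    x = gauss_newton(f, jac, np.array([1.0, 1.0]))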
17.3. HARTLEY’S MODIFICATION
As it has been stated, the Gauss-Newton iteration can quite easily fail since the
second term in the summation (17.15) is often not negligible, especially in the
case when the residuals f are large. However, if instead of replacing x by (x+q)
we use (x+tq), where t is a step-length parameter, it is possible to show that if J
is of full rank (JᵀJ is positive definite and hence non-singular), then this modified
Gauss-Newton algorithm always proceeds towards a minimum. For in this case, we
have

q = -(JᵀJ)⁻¹ Jᵀf        (17.18)
so that the inner product between q and the direction of steepest descent,
-Jᵀf, is

-qᵀv = fᵀJ (JᵀJ)⁻¹ Jᵀf > 0        (17.19)

since (JᵀJ)⁻¹ is positive definite by virtue of the positive definiteness of JᵀJ. Thus
the cosine of the angle between q and -v is always positive, guaranteeing that the
search is downhill. In practice, however, angles very close to 90° are observed,
which may imply slow convergence.
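The guarantee just derived translates directly into code. Below is a small
Python sketch of a single step of the step-length form, following the
conventions of the previous listing: q is computed from (17.18), the inner
product condition (17.19) is checked, and t is simply halved until the sum of
squares decreases. The halving is a stand-in chosen for brevity and is not
Hartley's quadratic search, which is described next; the cutoff 1e-10 on t is
likewise an arbitrary illustrative choice.

    import numpy as np

    def modified_gn_step(f, jac, x):
        # One step of the modified Gauss-Newton method: x becomes x + t*q.
        S = lambda z: np.dot(f(z), f(z))
        J, fx = jac(x), f(x)
        v = J.T @ fx                          # v = J'f, the gradient direction
        q = np.linalg.solve(J.T @ J, -v)      # step from (17.18)
        assert -q @ v > 0                     # descent condition (17.19)
        t = 1.0
        while S(x + t * q) >= S(x) and t > 1e-10:
            t *= 0.5                          # shorten step until S decreases
        return x + t * q if t > 1e-10 else x  # give up if no progress is made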
The modified Gauss-Newton procedure above has been widely used to mini-
mise sum-of-squares functions (Hartley 1961). Usually the one-dimensional search
at each iteration is accomplished by fitting a quadratic polynomial to the sum-of-
squares surface along the search direction and evaluating the function at the
estimated minimum of this quadratic. This requires at least three sum-of-squares
function evaluations per iteration and, furthermore, the apparent simplicity,
when stated in words, of this parabolic inverse interpolation as a means of
linear search hides a variety of stratagems which must be incorporated to
prevent moves either in the wrong direction or to points where S(x + q) > S(x).
When the function is