Page 153 - Compact Numerical Methods For Computers

Chapter 12

OPTIMISATION AND NONLINEAR EQUATIONS

12.1. FORMAL PROBLEMS IN UNCONSTRAINED
OPTIMISATION AND NONLINEAR EQUATIONS
The material which follows in the next few chapters deals with finding the minima
of functions or the roots of equations. The functions or equations will in general
be nonlinear in the parameters, that is, following Kowalik and Osborne (1968),
the problems they generate will not be solvable by means of linear equations
(though, as we shall see, iterative methods based on linear subproblems are very
important).
A special case of the minimisation problem is the nonlinear least-squares
problem which, because of its practical importance, will be presented first. This
can be stated: given M nonlinear functions of n parameters
$$f_i(b_1, b_2, \ldots, b_n), \qquad i = 1, 2, \ldots, M \tag{12.1}$$
minimise the sum of squares
$$S = \sum_{i=1}^{M} \left[ f_i(b_1, b_2, \ldots, b_n) \right]^2 \tag{12.2}$$
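As a minimal sketch of (12.1) and (12.2), the sum of squares can be computed directly from a list of functions. The three functions below are hypothetical, chosen only so that all of them vanish at b = (1, 2); they do not come from the text.

```python
def sum_of_squares(funcs, b):
    """Return the sum of f_i(b)**2 over the M functions in funcs."""
    return sum(f(b) ** 2 for f in funcs)


# Hypothetical example: M = 3 nonlinear functions of n = 2 parameters.
funcs = [
    lambda b: b[0] * b[1] - 2.0,
    lambda b: b[0] ** 2 - 1.0,
    lambda b: b[0] + b[1] - 3.0,
]

# Every function is zero at b = (1, 2), so the sum of squares is zero there.
print(sum_of_squares(funcs, [1.0, 2.0]))

# At b = (2, 1) only the second function is nonzero (it equals 3).
print(sum_of_squares(funcs, [2.0, 1.0]))
```

Since the sum of squares cannot be negative, a parameter vector giving the value zero is necessarily a minimum.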
It is convenient to collect the n parameters into a vector b; likewise the functions
can be collected as the vector of M elements f. The nonlinear least-squares
problem commonly, but not exclusively, arises in the fitting of equations to data
by the adjustment of parameters. The data, in the form of K variables (where
K = 0 when there are no data), may be thought of as occupying a matrix Y of
which the jth column $\mathbf{y}_j$ gives the value of variable j at each of the M data points.
The terms parameters, variables and data points as used here should be noted,
since they lead naturally to the least-squares problem with
$$\mathbf{f}(\mathbf{b}, \mathbf{Y}) = \mathbf{g}(\mathbf{b}, \mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_{K-1}) - \mathbf{y}_K \tag{12.3}$$
in which it is hoped to fit the function(s) g to the variable $\mathbf{y}_K$ by adjusting the
parameters b. This will reduce the size of the functions f, and hence reduce the
sum of squares S. Since the functions f in equation (12.3) are formed as
differences, they will be termed residuals. Note that the sign of these is the
opposite of that usually chosen. The sum of squares is, of course, the same. My
preference for the form used here stems from the fact that the partial derivatives
of f with respect to the parameters b are identical to those of g with respect to the
same parameters, so that the possibility of a sign error in their computation is
avoided.
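The residual convention of equation (12.3) can be sketched as follows. The exponential model g, the data matrix Y and the forward-difference helper are all assumptions made for illustration; none of them is taken from the text. The final lines check numerically that the derivative of a residual with respect to a parameter agrees with that of g.

```python
import math


def g(b, row):
    """Hypothetical model: predicts y_K from the single variable y_1 = row[0]."""
    return b[0] * math.exp(b[1] * row[0])


def residuals(b, Y):
    """Residual g(b, y_1, ..., y_{K-1}) - y_K for each of the M rows of Y."""
    return [g(b, row[:-1]) - row[-1] for row in Y]


def forward_difference(func, b, j, h=1e-6):
    """Approximate the partial derivative of func with respect to b[j]."""
    shifted = list(b)
    shifted[j] += h
    return (func(shifted) - func(b)) / h


# Made-up data: M = 3 data points, K = 2 variables (columns y_1 and y_2).
Y = [[0.0, 1.0], [1.0, 2.5], [2.0, 7.0]]
b = [1.0, 1.0]

f = residuals(b, Y)
S = sum(r * r for r in f)  # sum of squares of the residuals

# Subtracting the constant y_K does not change the derivative, so the
# derivative of the residual should match the derivative of g.
row = Y[1]
df = forward_difference(lambda p: g(p, row[:-1]) - row[-1], b, 0)
dg = forward_difference(lambda p: g(p, row[:-1]), b, 0)
print(abs(df - dg) < 1e-6)
```

With the opposite convention, in which the model value is subtracted from the observation, each derivative of the residual would be the negative of the corresponding derivative of g, which is where a sign error could creep in.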
