Page 153 - Compact Numerical Methods For Computers

Chapter 12

                            OPTIMISATION AND NONLINEAR EQUATIONS




                                      12.1. FORMAL PROBLEMS IN UNCONSTRAINED
                                       OPTIMISATION AND NONLINEAR EQUATIONS
                            The material which follows in the next few chapters deals with finding the minima
                            of functions or the roots of equations. The functions or equations will in general
                            be nonlinear in the parameters, that is, following Kowalik and Osborne (1968),
                            the problems they generate will not be solvable by means of linear equations
                            (though, as we shall see, iterative methods based on linear subproblems are very
                            important).
                              A special case of the minimisation problem is the nonlinear least-squares
                            problem which, because of its practical importance, will be presented first. This
                            can be stated: given M nonlinear functions of n parameters
                                               f (b ,b , . . . ,b )  i = 1, 2, . . . , M       (12.1)
                                                i
                                                            n
                                                  1
                                                     2
                            minimise the sum of squares
                                               S = \sum_{i=1}^{M} f_i^2                                    (12.2)
                              It is convenient to collect the n parameters into a vector b; likewise the functions
                            can be collected as the vector of M elements f. The nonlinear least-squares
                            problem commonly, but not exclusively, arises in the fitting of equations to data
                            by the adjustment of parameters. The data, in the form of K variables (where
                            K = 0 when there are no data), may be thought of as occupying a matrix Y of
                            which the jth column y_j gives the value of variable j at each of the M data points.
                            The terms parameters, variables and data points as used here should be noted,
                            since they lead naturally to the least-squares problem with
                                                f(b, Y) = g(b, y_1, y_2, \ldots, y_{K-1}) - y_K       (12.3)
                            in which it is hoped to fit the function(s) g to the variable y_K by adjusting the
                            parameters b. This will reduce the size of the functions f, and hence reduce the
                            sum of squares S. Since the functions f in equation (12.3) are formed as
                            differences, they will be termed residuals. Note that the sign of these is the
                            opposite of that usually chosen. The sum of squares is, of course, the same. My
                            preference for the form used here stems from the fact that the partial derivatives
                            of f with respect to the parameters b are identical to those of g with respect to the
                            same parameters, so the possibility of a sign error in their computation is
                            avoided.
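
The residual convention of (12.3) and the sum of squares of (12.2) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and does not come from the text: the exponential model `g`, the three made-up data points in `Y`, the trial parameters, and the helper names `residuals`, `sum_of_squares`, `dg_db` and `df_db_numeric`. The last helper estimates the derivatives of the residuals by central differences so that they can be compared with the analytic derivatives of g.

```python
import math

# Illustrative data matrix Y with K = 2 variables and M = 3 data points.
# Column 0 holds y_1 (here a time t); the last column holds y_K, the
# variable to be fitted. These numbers are made up for the sketch.
Y = [
    (0.0, 2.0),
    (1.0, 1.2),
    (2.0, 0.8),
]


def g(b, t):
    """Hypothetical model g(b, y_1) = b[0] * exp(-b[1] * y_1)."""
    return b[0] * math.exp(-b[1] * t)


def residuals(b, data):
    """Residuals in the form of (12.3): g(b, y_1) - y_K, one per data point."""
    return [g(b, row[0]) - row[-1] for row in data]


def sum_of_squares(b, data):
    """Sum of squares S of the residuals, as in (12.2)."""
    return sum(f * f for f in residuals(b, data))


def dg_db(b, t):
    """Analytic partial derivatives of g with respect to b[0] and b[1]."""
    e = math.exp(-b[1] * t)
    return [e, -b[0] * t * e]


def df_db_numeric(b, data, i, j, h=1e-6):
    """Central-difference estimate of the derivative of residual i w.r.t. b[j]."""
    up = list(b)
    dn = list(b)
    up[j] += h
    dn[j] -= h
    return (residuals(up, data)[i] - residuals(dn, data)[i]) / (2.0 * h)


if __name__ == "__main__":
    b = [2.0, 0.5]  # arbitrary trial parameters
    print("residuals:", residuals(b, Y))
    print("sum of squares:", sum_of_squares(b, Y))
    for i, row in enumerate(Y):
        for j in range(len(b)):
            # Numerical derivative of f next to the analytic derivative of g.
            print(i, j, df_db_numeric(b, Y, i, j), dg_db(b, row[0])[j])
```

Because the data value y_K enters each residual only as a subtracted constant, the two derivative columns printed by the loop should agree to within the accuracy of the difference approximation. Had the residuals been defined with the opposite sign, every derivative of f would be the negative of the corresponding derivative of g, while the sum of squares would be unchanged.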

