Page 218 - Compact Numerical Methods For Computers

Chapter 17

                      MINIMISING A NONLINEAR SUM OF SQUARES




                                              17.1. INTRODUCTION
                      The mathematical problem to be considered here is that of minimising

                                               S(x) = \sum_{i=1}^{m} f_i^2(x)            (17.1)

                      with respect to the parameters x_j, j = 1, 2, . . . , n (collected for convenience as the
                      vector x), where at least one of the functions f_i(x) is nonlinear in x. Note that by
                      collecting the m functions f_i(x), i = 1, 2, . . . , m, as a vector f, we get

                                               S(x) = f^T f.                             (17.2)
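As a concrete illustration (not from the text; all names here are illustrative), the objective S(x) = f^T f can be evaluated directly once the residuals are available as a function of the parameter vector:

```python
def sum_of_squares(residuals, x):
    """Compute S(x) = f^T f, the sum of squared residuals.

    residuals: callable mapping the parameter vector x to a
    sequence of residual values f_i(x), i = 1, ..., m.
    """
    f = residuals(x)
    return sum(fi * fi for fi in f)

# Example with two residuals, f_1 = x_1 - 1 and f_2 = x_2 - 2,
# evaluated at x = (0, 0):
S = sum_of_squares(lambda x: [x[0] - 1.0, x[1] - 2.0], [0.0, 0.0])
# S = 1 + 4 = 5
```

Any minimiser that can repeatedly evaluate such a function could, in principle, be applied to S(x); the point of this chapter is that the residual vector f carries more information than the single number S.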
                      The minimisation of a nonlinear sum-of-squares function is a sufficiently widespread
                      activity to have developed special methods for its solution. The principal
                      reason for this is that it arises whenever a least-squares criterion is used to fit a
                      nonlinear model to data. For instance, let y_i represent the weight of some
                      laboratory animal at week i after birth and suppose that it is desired to model this
                      by some function of the week number i, which will be denoted y(i,x), where x is
                      the set of parameters which will be varied to fit the model to the data. If the
                      criterion of fit is that the sum of squared deviations from the data is to be
                      minimised (least squares) then the objective is to minimise (17.1) where
                                               f_i(x) = y(i, x) - y_i                    (17.3)

                      or, in the case that confidence weightings are available for each data point,

                                               f_i(x) = [y(i, x) - y_i] w_i              (17.4)

                      where w_i, i = 1, 2, . . . , m, are the weightings. As a particular example of a growth
                      function, consider the three-parameter logistic function (Oliver 1964)

                               y(i, x) = y(i, x_1, x_2, x_3) = x_1 / [1 + exp(x_2 + i x_3)].   (17.5)
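To make the residual definitions concrete for the logistic growth model, the following sketch (the data values and starting parameters are invented for illustration, not taken from the text) evaluates both the unweighted residuals of (17.3) and the weighted form of (17.4):

```python
import math

def logistic(i, x):
    """Three-parameter logistic growth curve of (17.5):
    y(i, x) = x_1 / [1 + exp(x_2 + i*x_3)]."""
    return x[0] / (1.0 + math.exp(x[1] + i * x[2]))

def residuals(x, data):
    """Unweighted residuals f_i(x) = y(i, x) - y_i, as in (17.3)."""
    return [logistic(i, x) - yi for i, yi in data]

def weighted_residuals(x, data, w):
    """Weighted residuals f_i(x) = [y(i, x) - y_i] * w_i, as in (17.4)."""
    return [(logistic(i, x) - yi) * wi for (i, yi), wi in zip(data, w)]

# Hypothetical weekly weight observations (week i, observed y_i)
data = [(1, 5.3), (2, 7.2), (3, 9.6), (4, 12.5)]
x = [200.0, 3.6, -0.3]                           # illustrative parameter guess
S = sum(fi * fi for fi in residuals(x, data))    # objective (17.1)
```

Note that, following the sign convention above, each residual is "fitted minus actual", so the derivatives of f_i with respect to the parameters are exactly those of the model y(i, x).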
                        Note that the form of the residuals chosen in (17.3) and (17.4) is the negative
                      of the usual ‘actual minus fitted’ used in most of the statistical literature. The
                      reason for this is to make the derivatives of f_i(x) coincide with those of y(i, x).
                        The minimisation of S(x) could, of course, be approached by an algorithm for
                      the minimisation of a general function of n variables. Bard (1970) suggests that
                      this is not as efficient as methods which recognise the sum-of-squares form of
                      S(x), though more recently Biggs (1975) and McKeown (1974) have found
                      contrary results. In the paragraphs below, algorithms will be described which take
                      explicit note of the sum-of-squares form of S(x), since these are relatively simple
                      and, as building blocks, use algorithms for linear least-squares computations
                      which have already been discussed in earlier chapters.
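The flavour of such structure-exploiting methods can be previewed with a minimal Gauss-Newton sketch, in which each iteration solves the linear least-squares problem J d ≈ -f for a correction d. This is an illustrative sketch only (two parameters, forward-difference Jacobian, normal equations solved directly), not one of the algorithms developed in this chapter:

```python
import math

def gauss_newton(residuals, x, steps=20, h=1e-6):
    """Minimal Gauss-Newton sketch for exactly two parameters.

    Each step forms a forward-difference Jacobian J and solves the
    normal equations (J^T J) d = -J^T f for the correction d.
    """
    x = list(x)
    for _ in range(steps):
        f = residuals(x)
        # Forward-difference Jacobian: J[i][j] = d f_i / d x_j
        J = []
        for i in range(len(f)):
            row = []
            for j in range(len(x)):
                xp = list(x)
                xp[j] += h
                row.append((residuals(xp)[i] - f[i]) / h)
            J.append(row)
        # Normal equations for n = 2, solved by the 2x2 inverse
        a = sum(r[0] * r[0] for r in J)
        b = sum(r[0] * r[1] for r in J)
        c = sum(r[1] * r[1] for r in J)
        g0 = -sum(J[i][0] * f[i] for i in range(len(f)))
        g1 = -sum(J[i][1] * f[i] for i in range(len(f)))
        det = a * c - b * b
        x[0] += (c * g0 - b * g1) / det
        x[1] += (a * g1 - b * g0) / det
    return x

# Fit y = x_1 * exp(x_2 * t) to exact data generated with x = (2, 0.5),
# starting from a nearby guess (invented example data).
data = [(t, 2.0 * math.exp(0.5 * t)) for t in range(5)]
res = lambda x: [x[0] * math.exp(x[1] * t) - y for t, y in data]
x = gauss_newton(res, [1.8, 0.45])
# x should be close to [2.0, 0.5]
```

The full Gauss-Newton step can overshoot badly from a poor starting point, which is one reason the methods described later modify it; but the example shows how each iteration reduces to a linear least-squares subproblem of the kind treated in the earlier chapters.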