7.1.5 Newton Method
Like the steepest descent method, this method also uses the gradient to search for the minimum point of an objective function. Such gradient-based optimization methods are supposed to reach a point at which the gradient is (close to) zero. In this context, the optimization of an objective function f(x) is equivalent to finding a zero of its gradient g(x), which in general is a vector-valued function of a vector-valued independent variable x. Therefore, if we have the gradient function g(x) of the objective function f(x), we can solve the system of nonlinear equations g(x) = 0 to get the minimum of f(x) by using the Newton method explained in Section 4.4.
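This reduction can be tried out directly. The following is a minimal sketch (the test function is our own example, not from the text), where the minimum of f(x) = (x1 - 1)^2 + (x2 + 2)^2 is located by handing its gradient to MATLAB's nonlinear-equation solver fsolve (Optimization Toolbox):

g = @(x) [2*(x(1)-1); 2*(x(2)+2)];  % gradient of the assumed f(x)
x0 = [0; 0];                        % initial guess
xmin = fsolve(g, x0)                % zero of g(x), i.e., the minimizer [1; -2]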
The background of this method, as well as that of the steepest descent method, can be seen by taking the Taylor series expansion of, say, a two-variable objective function f(x_1, x_2):


$$
f(x_1,x_2) \cong f(x_{1k},x_{2k})
 + \left[\frac{\partial f}{\partial x_1}\;\;\frac{\partial f}{\partial x_2}\right]_{(x_{1k},x_{2k})}
   \begin{bmatrix} x_1-x_{1k}\\ x_2-x_{2k} \end{bmatrix}
 + \frac{1}{2}\,[\,x_1-x_{1k}\;\; x_2-x_{2k}\,]
   \begin{bmatrix}
     \partial^2 f/\partial x_1^2 & \partial^2 f/\partial x_1\,\partial x_2\\
     \partial^2 f/\partial x_2\,\partial x_1 & \partial^2 f/\partial x_2^2
   \end{bmatrix}_{(x_{1k},x_{2k})}
   \begin{bmatrix} x_1-x_{1k}\\ x_2-x_{2k} \end{bmatrix}
$$
$$
f(x) \cong f(x_k) + \nabla f(x)\big|_{x_k}^{T}\,[x-x_k]
 + \frac{1}{2}\,[x-x_k]^{T}\,\nabla^2 f(x)\big|_{x_k}\,[x-x_k]
$$
$$
f(x) \cong f(x_k) + g_k^{T}\,[x-x_k] + \frac{1}{2}\,[x-x_k]^{T} H_k\,[x-x_k] \tag{7.1.10}
$$

with the gradient vector $g_k = \nabla f(x)|_{x_k}$ and the Hessian matrix $H_k = \nabla^2 f(x)|_{x_k}$.
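Incidentally, for a concrete f(x) the gradient and Hessian need not be derived by hand. A minimal sketch (the objective function is our assumed example) using gradient() and hessian() from the Symbolic Math Toolbox:

syms x1 x2
f = x1^2 + x1*x2 + 2*x2^2;     % assumed example objective
g = gradient(f, [x1 x2])       % [2*x1 + x2; x1 + 4*x2]
H = hessian(f, [x1 x2])        % [2, 1; 1, 4]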
           In the light of this equation, we can see that the value of the objective function at
                       updated by the steepest descent algorithm described by
           point x k+1
           Eq. (7.1.9)
                                      (7.1.9)
                                 x k+1  =   x k − α k g k /||g k ||
           is most likely smaller than that at the old point x k , with the third term in
           Eq. (7.1.10) neglected.
$$
f(x_{k+1}) \cong f(x_k) + g_k^{T}\,[x_{k+1}-x_k] = f(x_k) - \alpha_k\, g_k^{T} g_k / \|g_k\| \tag{7.1.11}
$$
$$
f(x_{k+1}) - f(x_k) \cong -\alpha_k\, g_k^{T} g_k / \|g_k\| \le 0 \;\Rightarrow\; f(x_{k+1}) \le f(x_k)
$$
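To see Eq. (7.1.11) in action, here is a minimal sketch (the objective function and the step size alpha are our assumptions) of one update by Eq. (7.1.9), confirming that the objective value decreases:

f = @(x) x(1)^2 + 2*x(2)^2;    % assumed objective function
g = @(x) [2*x(1); 4*x(2)];     % its gradient
xk = [1; 1];  alpha = 0.1;     % current point and step size
gk = g(xk);
xk1 = xk - alpha*gk/norm(gk);  % steepest descent update, Eq. (7.1.9)
[f(xk) f(xk1)]                 % f(x_{k+1}) < f(x_k), as Eq. (7.1.11) predicts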
Slightly different from this strategy of the steepest descent algorithm, the Newton method tries to go straight to the zero of the gradient of the approximate objective function (7.1.10),
$$
g_k + H_k\,[x-x_k] = 0 \;\Rightarrow\; x = x_k - H_k^{-1} g_k \tag{7.1.12}
$$
by the updating rule
$$
x_{k+1} = x_k - H_k^{-1} g_k \tag{7.1.13}
$$
with the gradient vector $g_k = \nabla f(x)|_{x_k}$ and the Hessian matrix $H_k = \nabla^2 f(x)|_{x_k}$ (Appendix C).
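A minimal sketch of the updating rule (7.1.13) follows, with an assumed quadratic objective f(x) = x1^2 + x1*x2 + 2*x2^2 (our example, not the text's). Since the Hessian of a quadratic is constant, the quadratic model (7.1.10) is exact and a single Newton step lands on the minimizer:

g = @(x) [2*x(1)+x(2); x(1)+4*x(2)];   % gradient g_k
H = @(x) [2 1; 1 4];                   % Hessian H_k (constant for a quadratic)
x = [2; -1];                           % initial guess x0
for k = 1:20
   x = x - H(x)\g(x);                  % x_{k+1} = x_k - H_k^{-1}*g_k, Eq. (7.1.13)
   if norm(g(x)) < 1e-12, break; end   % stop when the gradient is (close to) zero
end
x, k                                   % reaches the minimizer [0; 0] at k = 1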