
5.4 HOMOTOPY CONTINUATION TRAINING METHOD FOR SEMIEMPIRICAL ANN-BASED MODELS
$$\frac{\partial H_a(\tau(s), w(s))}{\partial \tau} \frac{d\tau(s)}{ds} + \frac{\partial H_a(\tau(s), w(s))}{\partial w} \frac{dw(s)}{ds} = 0. \quad (5.51)$$

If we introduce an additional constraint of the form

$$\left( \frac{d\tau(s)}{ds} \right)^2 + \left( \frac{dw(s)}{ds} \right)^T \frac{dw(s)}{ds} = 1, \quad (5.52)$$

then the parameter s will represent the arc length of γ. Thus, we can trace γ(s) by solving the initial value problem

$$\begin{pmatrix} \dfrac{\partial H_a(\tau,w)}{\partial \tau} & \dfrac{\partial H_a(\tau,w)}{\partial w} \\[1ex] \dfrac{d\tau(s)}{ds} & \left( \dfrac{dw(s)}{ds} \right)^T \end{pmatrix} \begin{pmatrix} \dfrac{d\tau(s)}{ds} \\[1ex] \dfrac{dw(s)}{ds} \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad \tau(0) = 0, \quad w(0) = a. \quad (5.53)$$

As shown in [29], the arc length parametrization of the curve γ is optimal in the sense that the associated system of linear equations has the smallest possible condition number.
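For concreteness, a predictor step for (5.53) might be sketched as follows. This is a minimal illustration, assuming NumPy and hypothetical inputs dH_dtau and dH_dw (the partial derivatives of H_a, supplied by the model, e.g., via automatic differentiation); the previous step's unit tangent is used as the bordering row of the matrix:

```python
import numpy as np

def arc_length_tangent(dH_dtau, dH_dw, prev_tangent):
    """Solve the bordered linear system (5.53) for the unit tangent of gamma.

    dH_dtau      : (n,) array, partial derivative of H_a with respect to tau
    dH_dw        : (n, n) array, partial derivative of H_a with respect to w
    prev_tangent : (n+1,) array, unit tangent from the previous step, used
                   as the last row of the matrix in (5.53)
    """
    n = dH_dw.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[:n, 0] = dH_dtau           # first column: dH_a/dtau
    A[:n, 1:] = dH_dw            # remaining columns: dH_a/dw
    A[n, :] = prev_tangent       # bordering row approximates constraint (5.52)
    rhs = np.zeros(n + 1)
    rhs[n] = 1.0                 # right-hand side (0, ..., 0, 1) of (5.53)
    t = np.linalg.solve(A, rhs)
    return t / np.linalg.norm(t)  # renormalize to unit arc-length speed
```

Using the previous tangent in the last row, rather than the unknown tangent itself, is the usual linearization in pseudo-arc-length continuation; the final normalization restores the constraint (5.52).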
The initial value problem can be solved by various methods, both explicit and implicit. Note that although the global truncation error of the initial value problem solution inevitably accumulates as we trace the curve, we can significantly reduce it by applying an iterative corrector process that converges to the solution curve γ. This correction procedure is based on the fact that each point of γ satisfies the equation system H_a(τ, w) = 0. Hence, given a point (τ̃, w̃) that lies in the neighborhood of γ, we can find the closest point of γ by solving the following optimization problem:

$$\min_{\tau, w} \left\{ (\tilde{\tau} - \tau)^2 + \| \tilde{w} - w \|^2 \;\middle|\; H_a(\tau, w) = 0 \right\}. \quad (5.54)$$
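One inexpensive way to realize such a corrector, sketched below under the assumption that the residual H_a and its full Jacobian [∂H_a/∂τ, ∂H_a/∂w] are available as callables (the names H, J, and corrector are hypothetical), is to take minimum-norm Gauss–Newton steps toward the zero set of H_a; each step solves a linearization of (5.54):

```python
import numpy as np

def corrector(H, J, tau, w, tol=1e-10, max_iter=20):
    """Project a predicted point (tau, w) back onto the curve gamma.

    H : callable (tau, w) -> (n,) residual H_a(tau, w)
    J : callable (tau, w) -> (n, n+1) Jacobian [dH_a/dtau, dH_a/dw]

    Each iteration takes the minimum-norm Gauss-Newton step for the
    underdetermined system H_a(tau, w) = 0, which approximates the
    solution of the closest-point problem (5.54).
    """
    x = np.concatenate(([tau], w))
    for _ in range(max_iter):
        r = H(x[0], x[1:])
        if np.linalg.norm(r) < tol:
            break
        Jx = J(x[0], x[1:])
        # lstsq on an underdetermined system returns the least-norm step
        step, *_ = np.linalg.lstsq(Jx, -r, rcond=None)
        x = x + step
    return x[0], x[1:]
```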
We need to mention that the numerical continuation method described above requires the evaluation of the error function Hessian (5.48) at each step, which incurs a significant computational burden. Quasi-Newton methods allow for a faster estimation of the error function Hessian, but the accuracy of these estimates might be insufficient. Unfortunately, the Gauss–Newton approximation cannot be utilized either, because it presumes positive semidefiniteness of the Hessian. However, under the additional assumption that the error function Hessian ∂H_a(τ,w)/∂w has full rank at all points of the solution curve γ, the following properties hold. First, the eigenvalues of the Hessian ∂H_a(τ,w)/∂w never change their sign along the curve γ. Since all the eigenvalues are positive at (0, a), they remain positive at all points of γ (see [36]). This means that all points of γ, including the solution (1, w*) of the original problem, actually represent local minima of the error function for each fixed τ. Thus, the iterative corrector process may be implemented as a minimization of the error function with respect to w, with τ kept fixed. Also, the efficient Gauss–Newton Hessian approximation may be utilized. Finally, the parameter τ increases monotonically along the curve γ (i.e., the curve has no turning points with respect to τ). Therefore, the solution curve may be parametrized by τ instead of the arc length s. In this case, the homotopy continuation is performed by solving the initial value problem for Davidenko's system of ODEs, i.e.,

$$w(0) = a, \qquad \frac{dw}{d\tau} = - \left( \frac{\partial H_a(\tau, w)}{\partial w} \right)^{-1} \frac{\partial H_a(\tau, w)}{\partial \tau}. \quad (5.55)$$
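A minimal predictor–corrector sketch for (5.55), assuming NumPy/SciPy and hypothetical callables H, dH_dtau, and dH_dw supplied by the model, could look as follows; a uniform τ grid is used purely for simplicity:

```python
import numpy as np
from scipy.optimize import least_squares  # Levenberg-Marquardt corrector

def trace_davidenko(H, dH_dtau, dH_dw, a, n_steps=100):
    """Trace gamma from (0, a) to (1, w*) via Davidenko's ODE (5.55).

    H       : callable (tau, w) -> (n,) residual H_a(tau, w)
    dH_dtau : callable (tau, w) -> (n,) partial derivative in tau
    dH_dw   : callable (tau, w) -> (n, n) partial derivative in w

    Each step combines an explicit Euler predictor for (5.55) with a
    corrector that re-solves H_a(tau, w) = 0 at the new, fixed tau.
    """
    w = np.asarray(a, dtype=float)
    d_tau = 1.0 / n_steps
    for k in range(n_steps):
        tau = k * d_tau
        # Predictor: Euler step along dw/dtau = -(dH_a/dw)^{-1} dH_a/dtau.
        dw = -np.linalg.solve(dH_dw(tau, w), dH_dtau(tau, w))
        w = w + d_tau * dw
        tau += d_tau
        # Corrector: Levenberg-Marquardt on w with tau held fixed.
        w = least_squares(lambda v: H(tau, v), w, method="lm").x
    return w  # approximate solution w* at tau = 1
```

Here the corrector drives the residual H_a(τ, w) to zero with SciPy's Levenberg–Marquardt solver; since H_a is the gradient of the error function in w, this has the same solution set as minimizing the error function itself for fixed τ.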
This simple version of a homotopy continuation training algorithm is summarized below (see Algorithm 1). The iterative corrector process is implemented as a Levenberg–Marquardt method for minimization of the error function.