Page 191 - Neural Network Modeling and Identification of Dynamical Systems

5. SEMIEMPIRICAL NEURAL NETWORK MODELS OF CONTROLLED DYNAMICAL SYSTEMS

that the Hessian-vector products of the form

$$\sum_{i=1}^{n_x} \lambda_i(t,w)\, \frac{\partial^2 \hat{f}_i(\hat{x}(t,w),u(t),w)}{\partial w^2}$$

can be computed significantly faster by a combination of forward and reverse sweeps, in contrast to a straightforward approach that relies on evaluation of a full second derivative tensor

$$\left\{ \frac{\partial^2 \hat{f}_i(\hat{x}(t,w),u(t),w)}{\partial w^2} \right\}_{i=1}^{n_x}$$

by forward sweeps. The abovementioned algorithmic differentiation methods are implemented in a number of software packages, such as CppAD [26] and ADOL-C [27].

The continuous time versions of RTRL and BPTT algorithms share the same advantages and disadvantages as their discrete time counterparts. Namely, the backward-in-time method is more algorithmically efficient since it does not require the computation of second-order sensitivities

$$\left\{ \frac{\partial^2 \hat{x}_i(t,w)}{\partial w^2} \right\}_{i=1}^{n_x}$$

for all state variables, but only the computation of first-order sensitivities

$$\frac{\partial \lambda(t,w)}{\partial w}$$

for Lagrange multipliers. For instance, in the case of an aircraft longitudinal angular motion modeling problem, the backward-in-time algorithm speeds up the computation of the error function Hessian for a semiempirical neural network–based model with 472 weights by a factor of 11.8 as compared to a forward-in-time algorithm. This speedup further increases as the number of parameters $n_w$ grows. On the other hand, the backward-in-time method requires additional memory for storage of $\hat{x}(t,w)$ and $\partial \hat{x}(t,w) / \partial w$. Moreover, this method is only applicable when the whole training set is available beforehand, which is not the case with real-time adaptation problems.

The error function value is estimated via numerical integration of the definite integral in Eq. (5.8). Note that the target values of observable outputs are usually available only at some discrete time instants; thus, if the numerical integration procedure requires the integrand evaluation at other time instants, we need to perform interpolation or approximation of the corresponding vector-valued function of time $\tilde{y}(t)$. Estimates of the state variable values $\hat{x}(t,w)$ given by the model are computed by numerical solution of the initial value problem for the system of ODEs (5.2). In a special case, when all the time steps $(t^{(p)}_{k+1} - t^{(p)}_k)$ are equal and the definite integral value is approximated by the right Riemann sum with the same time step, the error function value estimate given by (5.8) is equivalent to the usual sum of the squared residuals error function. Also note that all these computations for each trajectory may be performed in parallel.

Despite the fact that the abovementioned equations for the error function as well as its derivatives are exact, the application of numerical methods for the solution of initial value problems, estimation of definite integrals, and interpolation of observable output target values introduces some error to the computed values. The size of this error depends on the specific numerical methods used, as well as the magnitude of integration time steps and the sampling period of measurements. In the rest of this section, we analyze the asymptotic behavior of these errors for the forward-in-time method with respect to the magnitude of time steps, assuming the lack of irreducible errors (i.e., the lack of measurement noise, which implies $y(t) = \tilde{y}(t)$).
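
The special case of equal time steps mentioned above can be illustrated with a small numerical sketch. It is not the book's implementation: the scalar ODE dx/dt = -w x, the explicit Euler solver, and the names `simulate_euler`, `right_riemann_error`, and `sum_squared_residuals` are all assumptions made for illustration. The integrand is taken to be the plain squared output residual, so any weighting or constant factor present in Eq. (5.8) is omitted. Under these assumptions the right Riemann sum equals the sum of squared residuals multiplied by the constant step, so both error functions have the same minimizers.

```python
import math


def simulate_euler(w, x0, times):
    """Explicit Euler solution of the toy ODE dx/dt = -w * x on the given grid.

    The toy model and the solver are illustrative assumptions only.
    """
    xs = [x0]
    for t0, t1 in zip(times, times[1:]):
        xs.append(xs[-1] + (t1 - t0) * (-w * xs[-1]))
    return xs


def right_riemann_error(times, y_target, y_model):
    """Right Riemann sum of the squared residual over [times[0], times[-1]].

    Each subinterval [t_k, t_{k+1}] contributes its length times the
    squared residual evaluated at the right endpoint t_{k+1}.
    """
    return sum(
        (t1 - t0) * (yt - ym) ** 2
        for t0, t1, yt, ym in zip(times, times[1:], y_target[1:], y_model[1:])
    )


def sum_squared_residuals(y_target, y_model):
    """Sum of squared residuals over the samples after the initial instant."""
    return sum((yt - ym) ** 2 for yt, ym in zip(y_target[1:], y_model[1:]))


dt = 0.1
times = [k * dt for k in range(11)]

# Noise-free targets, sampled at the same instants as the model output.
y_target = [math.exp(-1.0 * t) for t in times]

# Model prediction with a deliberately wrong parameter value.
y_model = simulate_euler(w=1.3, x0=1.0, times=times)

e_integral = right_riemann_error(times, y_target, y_model)
e_sse = sum_squared_residuals(y_target, y_model)

# The two values agree up to the constant factor dt (within rounding error).
print(e_integral, dt * e_sse)
```

When the targets are sampled on a coarser or irregular grid than the one used for integration, the values in `y_target` would first have to be interpolated to the integration instants, which is one of the error sources discussed in the text.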