Page 190 - Neural Network Modeling and Identification of Dynamical Systems

5.3 SEMIEMPIRICAL ANN-BASED MODEL DERIVATIVES COMPUTATION      181
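simplify (5.23), we define them as

$$
\begin{aligned}
\lambda(\bar{t}, w) &= 0, \\
\frac{d\lambda(t, w)}{dt} &= -\frac{\partial \hat{f}(\hat{x}(t, w), u(t), w)}{\partial \hat{x}}^{T} \lambda(t, w) - \frac{\partial e(\tilde{y}(t), \hat{x}(t, w), w)}{\partial \hat{x}}.
\end{aligned}
\tag{5.24}
$$

The ODEs in (5.24) are called the adjoint equations. Note that this initial value problem is to be solved in reverse time. Further simplification of (5.23) comes from the fact that the initial value of state variables does not depend on parameters $w$; hence $\frac{\partial \hat{x}(0, w)}{\partial w} \equiv 0$. Finally, we have

$$
\frac{\partial E(w)}{\partial w} = \frac{\partial L(w)}{\partial w} = \int_{0}^{\bar{t}} \left[ \frac{\partial e(\tilde{y}(t), \hat{x}(t, w), w)}{\partial w} + \frac{\partial \hat{f}(\hat{x}(t, w), u(t), w)}{\partial w}^{T} \lambda(t, w) \right] dt.
\tag{5.25}
$$

Then, the Hessian of the Lagrange function equals

$$
\begin{aligned}
\frac{\partial^{2} E(w)}{\partial w^{2}} = \frac{\partial^{2} L(w)}{\partial w^{2}} = \int_{0}^{\bar{t}} \Bigg[ & \frac{\partial^{2} e(\tilde{y}(t), \hat{x}(t, w), w)}{\partial w^{2}} + \frac{\partial \hat{f}(\hat{x}(t, w), u(t), w)}{\partial w}^{T} \frac{\partial \lambda(t, w)}{\partial w} \\
& + \sum_{i=1}^{n_x} \lambda_i(t, w) \frac{\partial^{2} \hat{f}_i(\hat{x}(t, w), u(t), w)}{\partial w^{2}} \\
& + \left( \frac{\partial^{2} e(\tilde{y}(t), \hat{x}(t, w), w)}{\partial w \, \partial \hat{x}} + \sum_{i=1}^{n_x} \lambda_i(t, w) \frac{\partial^{2} \hat{f}_i(\hat{x}(t, w), u(t), w)}{\partial w \, \partial \hat{x}} \right) \frac{\partial \hat{x}(t, w)}{\partial w} \Bigg] dt.
\end{aligned}
\tag{5.26}
$$

Differentiating the initial value problem for the first-order sensitivities (5.24) with respect to parameters $w$ yields the initial value problem for the second-order sensitivities, i.e.,

$$
\begin{aligned}
\frac{\partial \lambda(\bar{t}, w)}{\partial w} &= 0, \\
\frac{d}{dt} \frac{\partial \lambda(t, w)}{\partial w} &= -\frac{\partial^{2} e(\tilde{y}(t), \hat{x}(t, w), w)}{\partial \hat{x} \, \partial w} - \frac{\partial \hat{f}(\hat{x}(t, w), u(t), w)}{\partial \hat{x}}^{T} \frac{\partial \lambda(t, w)}{\partial w} \\
&\quad - \sum_{i=1}^{n_x} \lambda_i(t, w) \frac{\partial^{2} \hat{f}_i(\hat{x}(t, w), u(t), w)}{\partial \hat{x} \, \partial w} \\
&\quad - \left( \frac{\partial^{2} e(\tilde{y}(t), \hat{x}(t, w), w)}{\partial \hat{x}^{2}} + \sum_{i=1}^{n_x} \lambda_i(t, w) \frac{\partial^{2} \hat{f}_i(\hat{x}(t, w), u(t), w)}{\partial \hat{x}^{2}} \right) \frac{\partial \hat{x}(t, w)}{\partial w}.
\end{aligned}
\tag{5.27}
$$

Again, the initial value problem (5.27) is to be solved in reverse time. Thus, in order to compute the gradient and Hessian of the Lagrange function, we first need to solve the initial value problems for (5.2) and (5.16) forward-in-time, next we need to solve the initial value problems for adjoint equations (5.24) and (5.27) in reverse time, and finally we need to integrate (5.25) and (5.26).

In the above equations, expressions for derivatives of vector-valued functions $\hat{f}$ and $\hat{g}$ with respect to state variables $\hat{x}$ as well as parameters $w$ are assumed to be known. In the case these vector-valued functions are represented by layered feedforward neural networks, the corresponding derivatives are presented in Chapter 2. Otherwise, if they are represented by more general semiempirical models, their derivatives at required points may be computed by utilizing the algorithmic differentiation technique [25]. Note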