Page 191 - Neural Network Modeling and Identification of Dynamical Systems
182 5. SEMIEMPIRICAL NEURAL NETWORK MODELS OF CONTROLLED DYNAMICAL SYSTEMS
that the Hessian-vector products of the form

$$\sum_{i=1}^{n_x} \lambda_i(t,w)\,\frac{\partial^2 \hat{f}_i(\hat{x}(t,w),u(t),w)}{\partial w^2}$$

can be computed significantly faster by a combination of forward and reverse sweeps, in contrast to a straightforward approach that relies on evaluation of a full second derivative tensor

$$\left\{\frac{\partial^2 \hat{f}_i(\hat{x}(t,w),u(t),w)}{\partial w^2}\right\}_{i=1}^{n_x}$$

by forward sweeps. The abovementioned algorithmic differentiation methods are implemented in a number of software packages, such as CppAD [26] and ADOL-C [27].

The continuous time versions of RTRL and BPTT algorithms share the same advantages and disadvantages as their discrete time counterparts. Namely, the backward-in-time method is more algorithmically efficient since it does not require the computation of second-order sensitivities

$$\left\{\frac{\partial^2 \hat{x}_i(t,w)}{\partial w^2}\right\}_{i=1}^{n_x}$$

for all state variables, but only the computation of first-order sensitivities

$$\frac{\partial \lambda(t,w)}{\partial w}$$

for Lagrange multipliers. For instance, in the case of an aircraft longitudinal angular motion modeling problem, the backward-in-time algorithm speeds up the computation of the error function Hessian for a semiempirical neural network–based model with 472 weights by a factor of 11.8 as compared to a forward-in-time algorithm. This speedup further increases as the number of parameters $n_w$ grows. On the other hand, the backward-in-time method requires additional memory for storage of $\hat{x}(t,w)$ and $\partial \hat{x}(t,w)/\partial w$. Moreover, this method is only applicable when the whole training set is available beforehand, which is not the case with real-time adaptation problems.

The error function value is estimated via numerical integration of the definite integral in Eq. (5.8). Note that the target values of observable outputs are usually available only at some discrete time instants; thus, if the numerical integration procedure requires the integrand evaluation at other time instants, we need to perform interpolation or approximation of the corresponding vector-valued function of time $\tilde{y}(t)$. Estimates of the state variable values $\hat{x}(t,w)$ given by the model are computed by numerical solution of the initial value problem for the system of ODEs (5.2). In a special case, when all the time steps $(t^{(p)}_{k+1} - t^{(p)}_{k})$ are equal and the definite integral value is approximated by the right Riemann sum with the same time step, the error function value estimate given by (5.8) is equivalent to the usual sum of the squared residuals error function. Also note that all these computations for each trajectory may be performed in parallel.

Despite the fact that the abovementioned equations for the error function as well as its derivatives are exact, the application of numerical methods for the solution of initial value problems, estimation of definite integrals, and interpolation of observable output target values introduces some error to the computed values. The size of this error depends on the specific numerical methods used, as well as the magnitude of integration time steps and the sampling period of measurements. In the rest of this section, we analyze the asymptotic behavior of these errors for the forward-in-time method with respect to the magnitude of time steps, assuming the lack of irreducible errors (i.e., the lack of measurement noise, which implies $y(t) = \tilde{y}(t)$).
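
The special case of equal time steps with a right Riemann sum, and the dependence of the computed error function value on the step size, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the text: the scalar toy model dx/dt = -w x stands in for the ODE system (5.2), the explicit Euler method stands in for the ODE solver, the integrand is taken to be the plain unweighted squared residual (the exact form of Eq. (5.8) is not reproduced here), and the function names are hypothetical. Targets are noise-free, matching the assumption y(t) = ỹ(t).

```python
import math


def euler_states(w, x0, dt, n_steps):
    """Explicit Euler estimates of x(t_k) for dx/dt = -w*x at t_k = k*dt, k = 0..n_steps."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(xs[-1] + dt * (-w * xs[-1]))
    return xs


def residuals(w, w_true, x0, t_end, n_steps):
    """Residuals y(t_k) - x_hat(t_k, w) at the sample instants t_1..t_n (noise-free targets)."""
    dt = t_end / n_steps
    xs = euler_states(w, x0, dt, n_steps)
    return [x0 * math.exp(-w_true * k * dt) - xs[k] for k in range(1, n_steps + 1)]


def riemann_error(w, w_true, x0, t_end, n_steps):
    """Right Riemann sum estimate of the integral of the squared residual over [0, t_end]."""
    dt = t_end / n_steps
    return sum(dt * r * r for r in residuals(w, w_true, x0, t_end, n_steps))


def exact_error(w, w_true, x0, t_end):
    """Closed-form integral of (x0*exp(-w_true*t) - x0*exp(-w*t))**2 over [0, t_end]."""
    def integral_exp(c):
        # Integral of exp(-c*t) over [0, t_end].
        return (1.0 - math.exp(-c * t_end)) / c
    return x0 ** 2 * (integral_exp(2 * w_true)
                      - 2 * integral_exp(w + w_true)
                      + integral_exp(2 * w))


if __name__ == "__main__":
    w, w_true, x0, t_end = 2.0, 1.0, 1.0, 1.0  # illustrative values only
    exact = exact_error(w, w_true, x0, t_end)
    for n in (100, 200, 400):
        dt = t_end / n
        sse = sum(r * r for r in residuals(w, w_true, x0, t_end, n))
        estimate = riemann_error(w, w_true, x0, t_end, n)
        # Columns: steps, Riemann estimate, dt * SSE, gap to the exact integral.
        print(n, estimate, dt * sse, abs(estimate - exact))
```

In this sketch the Riemann estimate and the sum of squared residuals differ only by the constant factor dt, so both have the same minimizers with respect to w. Because explicit Euler and the right Riemann sum are both first-order methods, the gap to the exact integral should shrink roughly in proportion to the step size; a higher-order solver or quadrature rule would change that rate.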