Page 190 - Neural Network Modeling and Identification of Dynamical Systems
5.3 SEMIEMPIRICAL ANN-BASED MODEL DERIVATIVES COMPUTATION
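simplify (5.23), we define them as

\[
\begin{aligned}
\lambda(\bar{t}, w) &= 0, \\
\frac{d\lambda(t, w)}{dt} &= -\left[\frac{\partial \hat{f}(\hat{x}(t, w), u(t), w)}{\partial \hat{x}}\right]^{T} \lambda(t, w) - \frac{\partial e(\tilde{y}(t), \hat{x}(t, w), w)}{\partial \hat{x}}.
\end{aligned}
\tag{5.24}
\]

The ODEs in (5.24) are called the adjoint equations. Note that this initial value problem is to be solved in reverse time. Further simplification of (5.23) comes from the fact that the initial value of state variables does not depend on parameters $w$; hence $\frac{\partial \hat{x}(0, w)}{\partial w} \equiv 0$. Finally, we have

\[
\frac{\partial E(w)}{\partial w} = \frac{\partial L(w)}{\partial w}
= \int_{0}^{\bar{t}} \left\{ \frac{\partial e(\tilde{y}(t), \hat{x}(t, w), w)}{\partial w}
+ \left[\frac{\partial \hat{f}(\hat{x}(t, w), u(t), w)}{\partial w}\right]^{T} \lambda(t, w) \right\} dt.
\tag{5.25}
\]

Then, the Hessian of the Lagrange function equals

\[
\begin{aligned}
\frac{\partial^{2} E(w)}{\partial w^{2}} = \frac{\partial^{2} L(w)}{\partial w^{2}}
= \int_{0}^{\bar{t}} \Bigg\{ & \frac{\partial^{2} e(\tilde{y}(t), \hat{x}(t, w), w)}{\partial w^{2}}
+ \left[\frac{\partial \hat{f}(\hat{x}(t, w), u(t), w)}{\partial w}\right]^{T} \frac{\partial \lambda(t, w)}{\partial w} \\
& + \sum_{i=1}^{n_x} \lambda_i(t, w) \frac{\partial^{2} \hat{f}_i(\hat{x}(t, w), u(t), w)}{\partial w^{2}} \\
& + \left[ \frac{\partial^{2} e(\tilde{y}(t), \hat{x}(t, w), w)}{\partial w \, \partial \hat{x}}
+ \sum_{i=1}^{n_x} \lambda_i(t, w) \frac{\partial^{2} \hat{f}_i(\hat{x}(t, w), u(t), w)}{\partial w \, \partial \hat{x}} \right] \frac{\partial \hat{x}(t, w)}{\partial w} \Bigg\} dt.
\end{aligned}
\tag{5.26}
\]

Differentiating the initial value problem for the first-order sensitivities (5.24) with respect to parameters $w$ yields the initial value problem for the second-order sensitivities, i.e.,

\[
\begin{aligned}
\frac{\partial \lambda(\bar{t}, w)}{\partial w} &= 0, \\
\frac{d}{dt} \frac{\partial \lambda(t, w)}{\partial w} &= -\frac{\partial^{2} e(\tilde{y}(t), \hat{x}(t, w), w)}{\partial \hat{x} \, \partial w}
- \left[\frac{\partial \hat{f}(\hat{x}(t, w), u(t), w)}{\partial \hat{x}}\right]^{T} \frac{\partial \lambda(t, w)}{\partial w} \\
&\quad - \sum_{i=1}^{n_x} \lambda_i(t, w) \frac{\partial^{2} \hat{f}_i(\hat{x}(t, w), u(t), w)}{\partial \hat{x} \, \partial w} \\
&\quad - \left[ \frac{\partial^{2} e(\tilde{y}(t), \hat{x}(t, w), w)}{\partial \hat{x}^{2}}
+ \sum_{i=1}^{n_x} \lambda_i(t, w) \frac{\partial^{2} \hat{f}_i(\hat{x}(t, w), u(t), w)}{\partial \hat{x}^{2}} \right] \frac{\partial \hat{x}(t, w)}{\partial w}.
\end{aligned}
\tag{5.27}
\]

Again, the initial value problem (5.27) is to be solved in reverse time. Thus, in order to compute the gradient and Hessian of the Lagrange function, we first need to solve the initial value problems for (5.2) and (5.16) forward-in-time, next we need to solve the initial value problems for adjoint equations (5.24) and (5.27) in reverse time, and finally we need to integrate (5.25) and (5.26).

In the above equations, expressions for derivatives of vector-valued functions $\hat{f}$ and $\hat{g}$ with respect to state variables $\hat{x}$ as well as parameters $w$ are assumed to be known. In the case these vector-valued functions are represented by layered feedforward neural networks, the corresponding derivatives are presented in Chapter 2. Otherwise, if they are represented by more general semiempirical models, their derivatives at required points may be computed by utilizing the algorithmic differentiation technique [25]. Note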