follows:
\[
\lambda(t_k) = \frac{\partial E(W)}{\partial z(t_k)}. \tag{2.86}
\]
Error function sensitivities are computed during a backward-in-time pass, i.e.,
\[
\begin{aligned}
\lambda(t_{K+1}) &= 0, \\
\lambda(t_k) &= \frac{\partial e(\tilde{y}(t_k), z(t_k), W)}{\partial z}
+ \left( \frac{\partial F(z(t_k), u(t_k), W)}{\partial z} \right)^{T} \lambda(t_{k+1}),
\quad k = K, \ldots, 1.
\end{aligned} \tag{2.87}
\]
Finally, the error function derivatives with respect to parameters are expressed in terms of sensitivities, i.e.,
\[
\frac{\partial E(W)}{\partial W} = \sum_{k=1}^{K} \left[
\frac{\partial e(\tilde{y}(t_k), z(t_k), W)}{\partial W}
+ \left( \frac{\partial F(z(t_{k-1}), u(t_{k-1}), W)}{\partial W} \right)^{T} \lambda(t_k)
\right]. \tag{2.88}
\]
First-order derivatives of the instantaneous error function (2.85) have the form
\[
\begin{aligned}
\frac{\partial e(\tilde{y}, z, W)}{\partial W} &= -\left( \frac{\partial G(z, W)}{\partial W} \right)^{T} \left( \tilde{y} - G(z, W) \right), \\
\frac{\partial e(\tilde{y}, z, W)}{\partial z} &= -\left( \frac{\partial G(z, W)}{\partial z} \right)^{T} \left( \tilde{y} - G(z, W) \right).
\end{aligned} \tag{2.89}
\]
Since the mappings F and G are represented by layered feedforward neural networks, their derivatives can be computed as described in Section 2.2.2.
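To make the backward pass concrete, here is a minimal NumPy sketch of (2.86)–(2.89) for a single trajectory. It assumes the quadratic instantaneous error e(ỹ, z, W) = ½‖ỹ − G(z, W)‖², which is consistent with (2.89); the callables `F_jac` and `G_jac` and all argument conventions are hypothetical interface choices, not from the book.

```python
import numpy as np

# Hypothetical interface (not from the book):
#   F(z, u, W)     -> next state, shape (n,)
#   G(z, W)        -> model output, shape (m,)
#   F_jac(z, u, W) -> (dF/dz with shape (n, n), dF/dW with shape (n, p))
#   G_jac(z, W)    -> (dG/dz with shape (m, n), dG/dW with shape (m, p))

def bptt_gradient(z, u, y_meas, W, G, F_jac, G_jac):
    """Trajectory gradient dE/dW via the adjoint recursion (2.87), (2.88).

    z      : simulated states z(t_0), ..., z(t_K), shape (K+1, n)
    u      : controls u(t_0), ..., u(t_K), shape (K+1, r)
    y_meas : measurements y~(t_1), ..., y~(t_K), shape (K, m)
    """
    K, p = len(y_meas), W.size
    grad = np.zeros(p)
    lam_next = np.zeros(z.shape[1])            # lambda(t_{K+1}) = 0
    for k in range(K, 0, -1):                  # backward in time, k = K, ..., 1
        resid = y_meas[k - 1] - G(z[k], W)     # y~(t_k) - G(z(t_k), W)
        dG_dz, dG_dW = G_jac(z[k], W)
        de_dz = -dG_dz.T @ resid               # de/dz, eq. (2.89)
        de_dW = -dG_dW.T @ resid               # de/dW, eq. (2.89)
        dF_dz, _ = F_jac(z[k], u[k], W)
        lam = de_dz + dF_dz.T @ lam_next       # adjoint recursion (2.87)
        _, dF_dW = F_jac(z[k - 1], u[k - 1], W)
        grad += de_dW + dF_dW.T @ lam          # accumulate (2.88)
        lam_next = lam
    return grad
```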
Real-Time Recurrent Learning algorithm (RTRL) [68–70] for network outputs Jacobian. The model state sensitivities with respect to network parameters are computed during the forward-in-time pass, along with the states themselves. We have
\[
\begin{aligned}
\frac{\partial z(t_0)}{\partial W} &= 0, \\
\frac{\partial z(t_k)}{\partial W} &= \frac{\partial F(z(t_{k-1}), u(t_{k-1}), W)}{\partial W}
+ \frac{\partial F(z(t_{k-1}), u(t_{k-1}), W)}{\partial z} \frac{\partial z(t_{k-1})}{\partial W},
\quad k = 1, \ldots, K.
\end{aligned} \tag{2.90}
\]
The gradient of the individual trajectory error function (2.84) equals
\[
\frac{\partial E(W)}{\partial W} = \sum_{k=1}^{K} \left[
\frac{\partial e(\tilde{y}(t_k), z(t_k), W)}{\partial W}
+ \left( \frac{\partial z(t_k)}{\partial W} \right)^{T} \frac{\partial e(\tilde{y}(t_k), z(t_k), W)}{\partial z}
\right]. \tag{2.91}
\]
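Under the same hypothetical interface as the sketch above, the forward sensitivity recursion (2.90) and the gradient accumulation (2.91) can be expressed as follows; the sensitivity matrix `S` carries ∂z(t_k)/∂W and is the only extra state the algorithm maintains.

```python
def rtrl_gradient(z0, u, y_meas, W, F, G, F_jac, G_jac):
    """Trajectory gradient dE/dW via forward sensitivities (2.90), (2.91)."""
    n, p = z0.size, W.size
    z = z0
    S = np.zeros((n, p))                       # dz(t_0)/dW = 0
    grad = np.zeros(p)
    for k in range(1, len(y_meas) + 1):        # forward in time, k = 1, ..., K
        dF_dz, dF_dW = F_jac(z, u[k - 1], W)   # Jacobians at (z(t_{k-1}), u(t_{k-1}))
        S = dF_dW + dF_dz @ S                  # sensitivity recursion (2.90)
        z = F(z, u[k - 1], W)                  # state update
        resid = y_meas[k - 1] - G(z, W)
        dG_dz, dG_dW = G_jac(z, W)
        de_dz = -dG_dz.T @ resid               # eq. (2.89)
        de_dW = -dG_dW.T @ resid
        grad += de_dW + S.T @ de_dz            # accumulate (2.91)
    return grad
```

Unlike the backward-in-time pass, this variant needs no stored trajectory, at the cost of propagating the n × p matrix S forward at every step.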
A Gauss–Newton Hessian approximation may be obtained as follows:
\[
\begin{aligned}
\frac{\partial^2 E(W)}{\partial W^2} \approx \sum_{k=1}^{K} \Biggl[
& \frac{\partial^2 e(\tilde{y}(t_k), z(t_k), W)}{\partial W^2}
+ \frac{\partial^2 e(\tilde{y}(t_k), z(t_k), W)}{\partial W \, \partial z} \frac{\partial z(t_k)}{\partial W} \\
& + \left( \frac{\partial z(t_k)}{\partial W} \right)^{T} \frac{\partial^2 e(\tilde{y}(t_k), z(t_k), W)}{\partial z \, \partial W}
+ \left( \frac{\partial z(t_k)}{\partial W} \right)^{T} \frac{\partial^2 e(\tilde{y}(t_k), z(t_k), W)}{\partial z^2} \frac{\partial z(t_k)}{\partial W}
\Biggr].
\end{aligned} \tag{2.92}
\]
The corresponding approximations to second-order derivatives of the instantaneous error function have the form
\[
\begin{aligned}
\frac{\partial^2 e(\tilde{y}, z, W)}{\partial W^2} &\approx \left( \frac{\partial G(z, W)}{\partial W} \right)^{T} \frac{\partial G(z, W)}{\partial W}, \\
\frac{\partial^2 e(\tilde{y}, z, W)}{\partial W \, \partial z} &\approx \left( \frac{\partial G(z, W)}{\partial W} \right)^{T} \frac{\partial G(z, W)}{\partial z}, \\
\frac{\partial^2 e(\tilde{y}, z, W)}{\partial z^2} &\approx \left( \frac{\partial G(z, W)}{\partial z} \right)^{T} \frac{\partial G(z, W)}{\partial z}.
\end{aligned} \tag{2.93}
\]
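Substituting the blocks (2.93) into (2.92) collapses the four terms of each summand into a single product: with J_k = ∂G/∂W + (∂G/∂z) ∂z(t_k)/∂W, the total Jacobian of the model output at t_k with respect to the parameters, the approximation becomes Σ_k J_k^T J_k. A sketch under the same hypothetical interface as above:

```python
def gauss_newton_hessian(z0, u, K, W, F, G, F_jac, G_jac):
    """Gauss-Newton approximation of d^2E/dW^2 built from (2.92), (2.93)."""
    n, p = z0.size, W.size
    z = z0
    S = np.zeros((n, p))                       # dz(t_0)/dW = 0
    H = np.zeros((p, p))
    for k in range(1, K + 1):
        dF_dz, dF_dW = F_jac(z, u[k - 1], W)
        S = dF_dW + dF_dz @ S                  # sensitivity recursion (2.90)
        z = F(z, u[k - 1], W)
        dG_dz, dG_dW = G_jac(z, W)
        J_k = dG_dW + dG_dz @ S                # total output Jacobian at t_k
        H += J_k.T @ J_k                       # factored form of (2.92)
    return H
```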