Page 203 - Neural Network Modeling and Identification of Dynamical Systems

194           5. SEMIEMPIRICAL NEURAL NETWORK MODELS OF CONTROLLED DYNAMICAL SYSTEMS

   In order to solve it numerically, we parametrize the set of reference maneuvers (5.56) by a finite-dimensional parameter vector $\theta \in \mathbb{R}^{n_\theta}$, i.e.,

$$
\theta = \begin{pmatrix} \theta^{(1)} \\ \vdots \\ \theta^{(P)} \end{pmatrix}, \qquad
\bar{x}^{(p)}(0) = \begin{pmatrix} \theta^{(p)}_{1} \\ \vdots \\ \theta^{(p)}_{n_x} \end{pmatrix}, \qquad
\bar{u}^{(p)}(t) = \begin{pmatrix} \theta^{(p)}_{n_x + k n_u + 1} \\ \vdots \\ \theta^{(p)}_{n_x + (k+1) n_u} \end{pmatrix}, \tag{5.60}
$$

$$
t \in \bigl[\, \Delta t^{(p)} k,\; \Delta t^{(p)} (k+1) \,\bigr), \qquad k = 0, \ldots, K^{(p)} - 1,
$$

where $\Delta t^{(p)} = \bar{t}^{(p)} / K^{(p)}$ and $K^{(p)} \in \mathbb{N}$ is given. Thus, each control signal $\bar{u}^{(p)}$ is a piecewise constant function of time, defined on segments of duration $\Delta t^{(p)}$ and parametrized by the corresponding set of step values. The total number of parameters equals $n_\theta = n_x P + n_u \sum_{p=1}^{P} K^{(p)}$.

   The resulting nonlinear inequality-constrained optimization problem can be replaced with a series of unconstrained problems using the penalty function method. We also adopt a homotopy continuation method that gradually increases the prediction horizon for each trajectory, in a fashion similar to the algorithm described in Section 5.4.

   Since the objective function is discontinuous, the optimization can be performed only by means of zero-order algorithms. Numerical experiments indicate that the particle swarm method [48–51] is not well suited for this problem, because the system becomes ill-conditioned for long prediction horizons. Hence, the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) method [52–55] seems more appropriate for the task.

   The CMA-ES algorithm is a stochastic local optimization method for nonlinear nonconvex functions of real variables. It belongs to the class of evolution strategy algorithms; therefore, the iterative search procedure relies on the mutation, selection, and recombination mechanisms. Mutation of the vector of real variables amounts to the addition of a realization of a random vector drawn from the multivariate normal distribution with zero mean and covariance matrix $C \in \mathbb{R}^{n_\theta \times n_\theta}$. Thus, the current value of the parameter vector can be viewed as the mean vector of this normal distribution. Mutation is used to obtain $\lambda \geqslant 2$ candidate parameter vectors (the population). Then selection and recombination take place: the new value of the parameter vector is a weighted linear combination of the $\mu \in [1, \lambda]$ best individuals (i.e., the candidate solutions with the lowest objective function values), which maximizes their likelihood. Obviously, the values of the elements of the covariance matrix $C$ (also called the strategy parameters) have a significant impact on the effectiveness of the algorithm. However, the values of the strategy parameters that lead to efficient search steps are unknown a priori and usually tend to change during the search. Therefore, it is necessary to provide some form of adaptation of the strategy parameters during the search (hence the name Covariance Matrix Adaptation) in order to maximize the likelihood of successful search steps. The covariance matrix adaptation is performed incrementally, i.e., it is based not only on the current population, but also on the search history, which is stored in the vector $p_c \in \mathbb{R}^{n_\theta}$, referred to as the search path. Similarly, the search path $p_\sigma \in \mathbb{R}^{n_\theta}$ is used for adaptation of the step length $\sigma$. The Active CMA-ES algorithm extends the basic algorithm by incorporating information from the most unsuccessful search steps (with negative weights) into the covariance matrix adaptation step. Note that in the case of a convex quadratic objective function, the covariance matrix adaptation leads to a matrix proportional to the inverse Hessian matrix, just like the quasi-Newton methods. This optimization algorithm is scale and rotation invariant. Its convergence has not been proved for
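The mutation, selection, and recombination cycle described above can be sketched as follows. This is a deliberately simplified $(\mu/\mu_w, \lambda)$ evolution strategy step with a *fixed* covariance matrix and step length; the full CMA-ES would additionally adapt $C$ and $\sigma$ via the search paths $p_c$ and $p_\sigma$. All names and weight choices are illustrative assumptions, not the book's implementation.

```python
import numpy as np

def es_step(objective, mean, sigma, C, lam=8, mu=4, rng=None):
    """One generation of a simplified (mu/mu_w, lambda) evolution strategy:
    mutate, select the mu best, recombine by a weighted mean.
    (Full CMA-ES would also update C and sigma from the search paths.)"""
    if rng is None:
        rng = np.random.default_rng()
    n = mean.size
    # Mutation: lambda candidates drawn from N(mean, sigma^2 * C).
    A = np.linalg.cholesky(C)
    candidates = mean + sigma * (rng.standard_normal((lam, n)) @ A.T)
    # Selection: rank by objective value, keep the mu best individuals.
    order = np.argsort([objective(c) for c in candidates])
    best = candidates[order[:mu]]
    # Recombination: log-decreasing positive weights, normalized to sum to 1.
    w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    w /= w.sum()
    return w @ best  # new mean vector of the search distribution
```

Iterating this step on a convex quadratic objective drives the mean toward the minimizer; without the covariance and step-length adaptation, however, the final accuracy is limited by the fixed $\sigma$.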