Page 203 - Neural Network Modeling and Identification of Dynamical Systems

194           5. SEMIEMPIRICAL NEURAL NETWORK MODELS OF CONTROLLED DYNAMICAL SYSTEMS

   In order to solve it numerically, we parametrize the set of reference maneuvers (5.56) by a finite-dimensional parameter vector $\theta \in \mathbb{R}^{n_\theta}$, i.e.,

$$
\theta = \begin{pmatrix} \theta^{(1)} \\ \vdots \\ \theta^{(P)} \end{pmatrix}, \qquad
\bar{x}^{(p)}(0) = \begin{pmatrix} \theta^{(p)}_{1} \\ \vdots \\ \theta^{(p)}_{n_x} \end{pmatrix}, \qquad
\bar{u}^{(p)}(t) = \begin{pmatrix} \theta^{(p)}_{n_x + k n_u + 1} \\ \vdots \\ \theta^{(p)}_{n_x + (k+1) n_u} \end{pmatrix}, \tag{5.60}
$$

$$
t \in \bigl[\, \Delta t^{(p)} k,\; \Delta t^{(p)} (k+1) \,\bigr), \qquad k = 0, \ldots, K^{(p)} - 1,
$$

where $\Delta t^{(p)} = \bar{t}^{(p)} / K^{(p)}$ and $K^{(p)} \in \mathbb{N}$ is given. Thus, each control signal $\bar{u}^{(p)}$ is a piecewise constant function of time, defined on segments of duration $\Delta t^{(p)}$ and parametrized by the corresponding set of step values. The total number of parameters equals $n_\theta = n_x P + n_u \sum_{p=1}^{P} K^{(p)}$.

   The resulting nonlinear inequality-constrained optimization problem can be replaced with a series of unconstrained problems using the penalty function method. We also adopt a homotopy continuation method that gradually increases the prediction horizon for each trajectory, in a fashion similar to the algorithm described in Section 5.4.

   Since the objective function is discontinuous, the optimization can be performed only by means of zero-order algorithms. Numerical experiments indicate that the particle swarm method [48–51] is not well suited for this problem, because the system becomes ill-conditioned for long prediction horizons. Hence, the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) method [52–55] seems more appropriate for the task.

   The CMA-ES algorithm is a stochastic local optimization method for nonlinear nonconvex functions of real variables. It belongs to the class of evolution strategy algorithms; therefore, the iterative search procedure relies on the mutation, selection, and recombination mechanisms. Mutation of the vector of real variables amounts to the addition of a realization of a random vector drawn from the multivariate normal distribution with zero mean and covariance matrix $C \in \mathbb{R}^{n_\theta \times n_\theta}$. Thus, the current value of the parameter vector can be viewed as the mean vector of this normal distribution. Mutation is used to obtain $\lambda \geqslant 2$ candidate parameter vectors (the population). Then selection and recombination take place: the new value of the parameter vector is a weighted linear combination of the $\mu \in [1, \lambda]$ best individuals (i.e., the candidate solutions with the lowest objective function values), which maximizes their likelihood. Obviously, the values of the elements of the covariance matrix $C$ (also called the strategy parameters) have a significant impact on the effectiveness of the algorithm. However, the values of the strategy parameters that lead to efficient search steps are unknown a priori and usually tend to change during the search. Therefore, it is necessary to provide some form of adaptation of the strategy parameters during the search (hence the name Covariance Matrix Adaptation) in order to maximize the likelihood of successful search steps. The covariance matrix adaptation is performed incrementally, i.e., it is based not only on the current population, but also on the search history, which is stored in the vector $p_c \in \mathbb{R}^{n_\theta}$, referred to as the search path. Similarly, the search path $p_\sigma \in \mathbb{R}^{n_\theta}$ is used for adaptation of the step length $\sigma$. The Active CMA-ES algorithm extends the basic algorithm by incorporating information from the most unsuccessful search steps (with negative weights) into the covariance matrix adaptation step. Note that in the case of a convex quadratic objective function, the covariance matrix adaptation leads to a matrix proportional to the inverse Hessian matrix, just like the quasi-Newton methods. This optimization algorithm is scale and rotation invariant. Its convergence has not been proved for
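The mutation, selection, and recombination cycle described above can be sketched as follows. This is a deliberately simplified $(\mu/\mu_w, \lambda)$ evolution strategy step with a *fixed* covariance matrix and step length; the full CMA-ES would additionally adapt $C$ and $\sigma$ via the search paths $p_c$ and $p_\sigma$. All names and weight choices are illustrative assumptions, not the book's implementation.

```python
import numpy as np

def es_step(objective, mean, sigma, C, lam=8, mu=4, rng=None):
    """One generation of a simplified (mu/mu_w, lambda) evolution strategy:
    mutate, select the mu best, recombine by a weighted mean.
    (Full CMA-ES would also update C and sigma from the search paths.)"""
    if rng is None:
        rng = np.random.default_rng()
    n = mean.size
    # Mutation: lambda candidates drawn from N(mean, sigma^2 * C).
    A = np.linalg.cholesky(C)
    candidates = mean + sigma * (rng.standard_normal((lam, n)) @ A.T)
    # Selection: rank by objective value, keep the mu best individuals.
    order = np.argsort([objective(c) for c in candidates])
    best = candidates[order[:mu]]
    # Recombination: log-decreasing positive weights, normalized to sum to 1.
    w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    w /= w.sum()
    return w @ best  # new mean vector of the search distribution
```

Iterating this step on a convex quadratic objective drives the mean toward the minimizer; without the covariance and step-length adaptation, however, the final accuracy is limited by the fixed $\sigma$.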