Page 302 - Artificial Intelligence in the Age of Neural Networks and Brain Computing
2. Background and Related Work
neural network. Neuroevolution is particularly well suited to POMDP
(partially observable Markov decision process) problems because of recurrence: it
is possible to evolve recurrent connections that disambiguate hidden states.
The weights can be optimized using various evolutionary techniques. Genetic al-
gorithms are a natural choice because crossover is a good match with neural net-
works: it recombines parts of existing networks to find better ones.
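As a concrete illustration, crossover and mutation on flat weight vectors can be sketched as follows. This is a minimal single-point scheme; the function names and parameter values are illustrative and not taken from any of the methods cited here:

```python
import random

def crossover(parent_a, parent_b):
    # Single-point crossover: the child inherits a prefix of one parent's
    # weights and the suffix of the other's.
    point = random.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(weights, rate=0.1, scale=0.5):
    # Perturb each weight with probability `rate` by Gaussian noise.
    return [w + random.gauss(0.0, scale) if random.random() < rate else w
            for w in weights]

# Recombine two parent weight vectors and perturb the child.
parent_a = [0.1, 0.2, 0.3, 0.4]
parent_b = [1.0, 2.0, 3.0, 4.0]
child = mutate(crossover(parent_a, parent_b))
```

In a full neuroevolution loop, the fitness of each weight vector would be measured by loading it into the network and evaluating the network on the task.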
CMA-ES [17], a technique for continuous optimization, also works well for optimizing
the weights because it can capture interactions between them. Other ap-
proaches such as SANE, ESP, and CoSyNE evolve partial neural networks and
combine them into fully functional networks [18-20]. Further, techniques such as
Cellular Encoding [21] and NEAT [12] have been developed to evolve the topology
of the neural network, which is particularly effective in determining the required
recurrence. Neuroevolution techniques have been shown to work well in many tasks
in control, robotics, constructing intelligent agents for games, and artificial life [14].
However, because of the large number of weights to be optimized, they are generally
limited to relatively small networks.
Evolution has been combined with gradient descent-based learning in several
ways, making it possible to utilize much larger networks. These methods are still
usually applied to sequential decision tasks, but gradients from a related task
(such as prediction of the next sensory inputs) are used to help search. Much of
the work is based on utilizing the Baldwin effect, where learning affects only the
selection [22]. Computationally, it is possible to utilize Lamarckian evolution as well,
that is, encode the learned weight changes back into the genome [21]. However, care
must be taken to maintain diversity so that evolution can continue to innovate when
all individuals are learning similar behavior.
Evolution of DNNs departs from this prior work in that it is applied to supervised
domains where gradients are available, and evolution is used only to optimize the
design of the neural network. Deep neuroevolution is thus more closely related to
bilevel (or multilevel) optimization techniques [23]. The idea is to use an evolu-
tionary optimization process at a high level to optimize the parameters of a low-
level evolutionary optimization process.
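The bilevel scheme can be sketched with a toy inner problem (minimizing |x|), where the outer evolutionary loop tunes the inner loop's mutation scale. All names, population sizes, and settings here are illustrative assumptions, not values from the cited work:

```python
import random

def inner_evolve(mutation_scale, generations=20, pop_size=10):
    # Inner loop: evolve a scalar toward 0 with the given mutation scale.
    pop = [random.uniform(-5, 5) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=abs)                 # fitness = closeness to 0
        parents = pop[:pop_size // 2]
        pop = parents + [p + random.gauss(0, mutation_scale) for p in parents]
    return -abs(min(pop, key=abs))        # higher is better

def outer_evolve(generations=10, pop_size=6):
    # Outer loop: evolve the mutation scale itself (the hyperparameter),
    # scoring each candidate by how well the inner evolution performs with it.
    scales = [random.uniform(0.01, 2.0) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(scales, key=inner_evolve, reverse=True)
        best = scored[:pop_size // 2]
        scales = best + [max(1e-3, s + random.gauss(0, 0.1)) for s in best]
    return scales[0]
```

The key design point is that each outer-level fitness evaluation requires a complete run of the inner evolutionary process, which is why bilevel optimization pays off mainly when hand-tuning the inner parameters is the bottleneck.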
Consider for instance the problem of controlling a helicopter through aileron,
elevator, rudder, and rotor inputs. This is a challenging benchmark from the
2000s, for which various reinforcement learning approaches have been developed
[24-26]. One of the most successful is single-level neuroevolution, where
the helicopter is controlled by a neural network that is evolved through genetic
algorithms [27]. The eight parameters of the neuroevolution method (such as mutation
and crossover rate, probability, and amount, and population and elite size) are
optimized by hand. It would be difficult to include more parameters because the
parameters interact nonlinearly. A large part of the parameter space thus remains un-
explored in the single-level neuroevolution approach. However, a bilevel approach,
where a high-level evolutionary process is employed to optimize these parameters,
can search this space more effectively [28]. With bilevel evolution, the number of pa-
rameters optimized could be extended to 15, which would result in a significantly