
FIGURE 15.2
Top: Simplified visualization of the best network evolved by CoDeepNEAT for the CIFAR-10 domain. Node 1 is the input layer, while Node 2 is the output layer. The network has a repetitive structure because its blueprint reuses the same module in multiple places. Bottom: A more detailed visualization of the same network.


The networks discovered by CoDeepNEAT also train quickly. While the network of Snoek et al. takes over 30 epochs to reach 20% test error and over 200 epochs to converge, the best evolved network takes only 12 epochs to reach 20% test error and around 120 epochs to converge. This network instantiates the same modules multiple times, resulting in the deep, repetitive structure typical of many successful DNNs (Fig. 15.2).



4. EVOLUTION OF LSTM ARCHITECTURES

Recurrent neural networks, in particular those utilizing LSTM nodes, are another powerful approach to DNNs. Much of their power comes from the repetition of LSTM modules and the connectivity between them. In this section, CoDeepNEAT is extended with mutations that allow searching for such connectivity, and the approach is evaluated on the standard benchmark task of language modeling.
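As a rough illustration of what such a connectivity-search mutation could look like, the sketch below represents a stack of LSTM layers together with a set of optional skip connections and mutates that set. The genome representation, class names, and mutation operators are hypothetical, not the chapter's actual implementation.

import random

class LSTMConnectivityGenome:
    """Hypothetical genome: a stack of LSTM layers plus optional skip connections."""
    def __init__(self, num_layers):
        self.num_layers = num_layers
        self.skips = set()  # pairs (src, dst) with dst > src + 1

    def mutate_add_skip(self):
        # Add a feed-forward skip connection between two non-adjacent layers.
        candidates = [(i, j) for i in range(self.num_layers)
                      for j in range(i + 2, self.num_layers)
                      if (i, j) not in self.skips]
        if candidates:
            self.skips.add(random.choice(candidates))

    def mutate_remove_skip(self):
        # Disable a previously added skip connection.
        if self.skips:
            self.skips.remove(random.choice(sorted(self.skips)))

# Example: start from a plain four-layer stack and apply one structural mutation.
genome = LSTMConnectivityGenome(num_layers=4)
genome.mutate_add_skip()
print(genome.skips)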

4.1 EXTENDING CoDeepNEAT to LSTMs
An LSTM consists of gated memory cells that can integrate information over longer time scales (as compared to simply using recurrent connections in a neural network). LSTMs have recently been shown to be powerful in supervised sequence-processing tasks such as speech recognition [35] and machine translation [36].
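For reference, the update of a standard ("vanilla") LSTM memory cell can be written as follows, using common notation (\(\sigma\) is the logistic sigmoid and \(\odot\) the elementwise product); these are the widely used equations rather than a formulation specific to this chapter:

\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) &&\text{(input gate)}\\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) &&\text{(forget gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) &&\text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) &&\text{(candidate cell state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \qquad h_t = o_t \odot \tanh(c_t)
\end{aligned}

The gated cell state \(c_t\) can carry information across many time steps, which is what allows LSTMs to integrate information over longer time scales.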
Recent research on LSTMs has focused on two directions: finding variations of the individual LSTM memory unit architecture [37–40] and discovering new ways of stitching LSTM layers into a network [41–43]. Both approaches have improved performance over vanilla LSTMs, with the best recent results achieved through