respect to Euclidean translations in order to drastically reduce the number of weights
which must be estimated or trained, and improve accuracy when one only has a finite
amount of data. Note however that this trick only works when the input data do
possess Euclidean symmetry; the cells of our retina do not. Of course, it is easy
enough to apply this same principle to a feedforward ANN with many layers, as
LeCun and others have done many times.
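To make the parameter saving concrete, here is a small sketch (not from the chapter; the input length and kernel width are arbitrary assumptions) contrasting a fully connected layer with a single weight-shared kernel applied at every position of a 1-D input:

```python
# Illustrative sketch: parameter counts for a dense layer versus a
# weight-shared (convolutional) layer on a 1-D input. Sizes are arbitrary.
import numpy as np

n_inputs, n_outputs, kernel_size = 64, 64, 5

# Fully connected layer: every output has its own weight to every input.
dense_weights = np.random.randn(n_outputs, n_inputs)

# Weight sharing under translation: one small kernel reused at every position,
# which is only justified when the data really are translation-symmetric.
shared_kernel = np.random.randn(kernel_size)

def conv1d(x, kernel):
    """Valid 1-D convolution using the same kernel at every position."""
    k = len(kernel)
    return np.array([x[i:i + k] @ kernel for i in range(len(x) - k + 1)])

x = np.random.randn(n_inputs)
print("dense parameters: ", dense_weights.size)    # 4096
print("shared parameters:", shared_kernel.size)    # 5
print("conv output length:", conv1d(x, shared_kernel).shape[0])
```

The dense layer needs thousands of weights to be estimated from data, while the shared kernel needs only a handful, which is exactly the reduction in trainable weights described above.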
A simpler design which has also been crucial to the success of LeCun and others
is the autoencoder or bottleneck network, developed and popularized by Cottrell
long ago [24]. The idea is to train an ordinary feedforward ANN with N layers,
just like the example of Fig. 8.5B (but with more layers sometimes), but to train
it to try to output predictions of the same data which it uses as input. This would
be a trivial prediction task (just set the outputs to equal the known inputs), except
when the hidden neurons in one or more layers are much fewer than the number
of inputs, making it impossible to learn to set all the outputs equal to all the inputs.
The hidden layer (bottleneck layer) of such a network learns to be a kind of
compressed representation of the image. By developing layer upon layer of such
compression, trained over a large set of images, LeCun develops a kind of
preprocessing which improves the performance of later stages of prediction and
classification, as basic statistics easily explains.
A neat trick here is that one can train the autoencoders over millions of images
which have not been classified, and use a more limited set of data labeled for the
prediction task itself.
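As a hedged illustration of the bottleneck idea (not the chapter's code; the layer sizes, learning rate, and random stand-in data below are assumptions), a one-hidden-layer autoencoder trained to reproduce its own inputs can be written in a few lines:

```python
# Minimal sketch of a bottleneck autoencoder: a narrow hidden layer is forced
# to learn a compressed representation because it cannot simply copy inputs.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_bottleneck = 32, 4
X = rng.standard_normal((500, n_in))        # stand-in for unlabeled data

W1 = rng.standard_normal((n_in, n_bottleneck)) * 0.1   # encoder weights
W2 = rng.standard_normal((n_bottleneck, n_in)) * 0.1   # decoder weights
lr = 0.01

for epoch in range(200):
    H = np.tanh(X @ W1)          # compressed (bottleneck) representation
    X_hat = H @ W2               # attempt to reconstruct the inputs
    err = X_hat - X
    # Backpropagate the squared reconstruction error.
    grad_W2 = H.T @ err / len(X)
    grad_H = err @ W2.T
    grad_W1 = X.T @ (grad_H * (1 - H**2)) / len(X)
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

print("reconstruction MSE:", float(np.mean(err**2)))
# The hidden activations H can then feed a later classifier trained on a
# smaller labeled set, as the text describes.
```

Because the reconstruction target is the input itself, no labels are needed at this stage; only the later prediction or classification stage requires labeled data.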
3. FROM RNNs TO MOUSE-LEVEL COMPUTATIONAL
INTELLIGENCE: NEXT BIG THINGS AND BEYOND
3.1 TWO TYPES OF RECURRENT NEURAL NETWORK
The greater use of recurrent neural networks can give a very substantial improve-
ment in neural network performance, but it is extremely important to distinguish
between two types of recurrence: time-lagged recurrence versus simultaneous
recurrence. Applications which do not distinguish between these two tend to use
training procedures which are inconsistent either in mathematics or in addressing
the tasks which the developer imagines they might address.
The easiest type of recurrent network to understand and use is the Time-Lagged
Recurrent Network (TLRN), depicted in Fig. 8.9.
The key idea is that one can augment any input-output network, such as an
N-layer ANN or a CoNN, by designing it to output additional variables forming a
vector R, which are used only as additional inputs (a kind of memory) for the
next time period.
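The data flow of a TLRN can be sketched as follows; this is an illustrative, untrained single-layer example with assumed sizes, not the network of Fig. 8.9:

```python
# Sketch of time-lagged recurrence: a feedforward map whose extra outputs
# R(t) are fed back as extra inputs at time t+1. The sizes and the helper
# name `step` are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(1)
n_x, n_y, n_r = 3, 1, 4                 # external inputs, outputs, memory vars

W = rng.standard_normal((n_y + n_r, n_x + n_r)) * 0.1
b = np.zeros(n_y + n_r)

def step(x_t, r_prev):
    """One time step: map (external inputs, previous R) to (outputs, new R)."""
    z = np.tanh(W @ np.concatenate([x_t, r_prev]) + b)
    return z[:n_y], z[n_y:]             # prediction y(t), memory vector R(t)

r = np.zeros(n_r)                       # memory starts empty
for t in range(5):
    x_t = rng.standard_normal(n_x)      # stand-in for the time series input
    y_t, r = step(x_t, r)               # R(t) becomes an input at time t+1
    print(f"t={t}  y={y_t.round(3)}  R={r.round(3)}")
```

The essential point is that R is produced by the same feedforward pass as the prediction and is carried forward one time period, giving the network a memory without any recurrence within a single time step.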
I still remember the major time-series forecasting competition led by Sven
Crone, which was presented at several major conferences in statistics, forecasting,
neural networks, and computer science, including IJCNN 2007 in Orlando, Florida.