Page 181 - Artificial Intelligence in the Age of Neural Networks and Brain Computing
2. Brief History and Foundations of the Deep Learning Revolution 171
said that someone should do research to develop more rigorous ways to decide on the
number of layers, and so on. They complain that such choices are usually based on
trying things out, and seeing what happens to error. In fact, such research was done
long ago, and many tools exist which implement rational methods. All the standard
methods of statistics also work. People planning to use ANNs in forecasting or
time-series modeling applications should be sure to understand the rigorous methods
developed in statistics [22], which are also based on trying models out and seeing
what happens to error, in a systematic way. Section 3 will say more about numbers
of layers.
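As a minimal sketch of what "trying models out and seeing what happens to error, in a systematic way" can look like, the example below uses k-fold cross-validation to choose model complexity. It is an illustration under stated assumptions, not the specific procedure of [22]: polynomial degree stands in for the number of layers, the data are synthetic, and the names `cv_error`, `candidates`, and `best` are hypothetical.

```python
import numpy as np

def cv_error(x, y, degree, k=5, seed=0):
    """Mean squared held-out error of a polynomial of the given degree (k-fold CV)."""
    rng = np.random.default_rng(seed)
    # The same seed gives every candidate model the same folds, so errors are comparable.
    folds = np.array_split(rng.permutation(len(x)), k)
    sq_errors = []
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(x)), held_out)
        coeffs = np.polyfit(x[train], y[train], degree)   # fit on the training part only
        pred = np.polyval(coeffs, x[held_out])            # predict the held-out part
        sq_errors.append((pred - y[held_out]) ** 2)
    return float(np.mean(np.concatenate(sq_errors)))

# Synthetic data (an assumption for illustration): a cubic signal plus noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 60)
y = x**3 - 0.5 * x + 0.05 * rng.standard_normal(x.size)

# Try each candidate complexity and keep the one with the lowest held-out error.
candidates = range(1, 8)
errors = {d: cv_error(x, y, d) for d in candidates}
best = min(errors, key=errors.get)

for d in candidates:
    print(f"degree {d}: held-out MSE = {errors[d]:.5f}")
print("selected degree:", best)
```

The same loop applies to an ANN if `degree` is replaced by the number of layers and the polynomial fit by network training; the point is only that the comparison is made on data the model was not fitted to.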
Widespread use of the Convolutional Neural Network (CoNN) was arguably the
most important new direction in the new wave of deep learning which started in
2009–11. The basic CoNN is a variation of the simple feedforward ANN shown
in Fig. 8.5B, varied in a way which makes it possible to handle a much larger number
of inputs. The key idea is illustrated in Fig. 8.8.
The CoNN addresses the special case where the inputs to the neural network are
organized in a regular Euclidean grid, like what we often see in camera images. In a
naïve ANN, each of the many, many hidden neurons would take inputs from
different regions of the image. That would require estimating many, many parameters
to train the network (millions, if there are millions of pixels in the image). From
statistical theory, we know that this would require really huge amounts of data, even
by the standards of big data, to achieve reasonable accuracy.
The key idea in CoNNs is to “reuse” the same hidden neuron “in different
locations.” Equivalently, we could phrase this as “sharing weights” between all
the sibling hidden neurons of the same type handling different parts of the image.
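The weight-sharing idea can be sketched in a few lines of NumPy. This is an illustrative sketch, not the implementation behind Fig. 8.8: the 32 x 32 image, the 5 x 5 receptive field, the tanh activation, and the function names `shared_neuron_map`, `dense_param_count`, and `shared_param_count` are all assumptions made for the example. It applies one hidden neuron at every valid image location and compares the number of weights with a naïve fully connected layer that has one independent hidden neuron per location.

```python
import numpy as np

def shared_neuron_map(image, kernel, bias=0.0):
    """Apply one hidden neuron (kernel weights + bias, tanh) at every valid location."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            # Same weights and bias at every (i, j): this is the weight sharing.
            out[i, j] = np.tanh(np.sum(patch * kernel) + bias)
    return out

def dense_param_count(n_pixels, n_hidden):
    """Weights + biases when every hidden neuron sees every pixel independently."""
    return n_hidden * (n_pixels + 1)

def shared_param_count(kh, kw, n_types):
    """Weights + biases when each neuron type is reused across all locations."""
    return n_types * (kh * kw + 1)

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))    # hypothetical small grayscale image
kernel = rng.standard_normal((5, 5))     # weights of one shared hidden neuron

feature_map = shared_neuron_map(image, kernel)

# Consequence of sharing: shifting the input by one pixel shifts the output by one pixel.
shifted_map = shared_neuron_map(np.roll(image, shift=1, axis=1), kernel)
shift_consistent = np.allclose(shifted_map[:, 1:], feature_map[:, :-1])

print("feature map shape:", feature_map.shape)
print("naive layer parameters:", dense_param_count(32 * 32, 28 * 28))
print("shared-weight parameters:", shared_param_count(5, 5, 1))
print("shifted input gives shifted output:", shift_consistent)
```

Even at this small size, the naïve layer needs 803,600 parameters for 784 hidden neurons, while the shared version needs 26 for the same number of output locations; the gap widens as the image grows.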
In terms of basic mathematics, the CoNN is exploiting the idea of invariance with
FIGURE 8.8
Schematic of what a CoNN is. See LeCun’s tutorial [23] for more details.