come from other neurons and its output is distributed to other nodes. The organization of the neurons is hierarchical if the nodes are structured in layers. This kind of topology is somewhat reminiscent of the organization of pyramidal neurons in the mammalian brain. Other kinds of topologies have been considered, in particular maps and grids, where the neurons are organized in 2D or 3D distributions.
The most typical neural network architecture is referred to as the multilayer perceptron (MLP): here the neurons are organized in successive layers, the nodes in successive layers are connected, and the connections are weighted; thus, the MLP is of the feedforward type. A pictorial illustration of the MLP NN is reported in Fig. 11.1A. The MLP is quite similar to the ADALINE previously proposed in the signal processing literature, apart from the presence of nonlinearities, at least in the hidden layer's nodes. Various nonlinear functions have been proposed for approximation, pattern recognition, and classification problems; among them, monotonic saturating functions (i.e., sigmoids) are a favorite choice. However, in recurrent networks there are also feedback links, and this enriches the dynamical behavior of NNs [1].
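As a purely illustrative aside (not part of the original text), the following minimal Python sketch shows the feedforward pass of such a one-hidden-layer MLP with sigmoid hidden units; the layer sizes, the random weight initialization, and the use of NumPy are all assumptions made for the example.

import numpy as np

def sigmoid(z):
    """Monotonic saturating nonlinearity used in the hidden layer."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Arbitrary layer sizes for the sketch: 3 inputs, 5 hidden units, 2 outputs.
W1 = rng.normal(size=(5, 3))   # weighted connections: input -> hidden
b1 = np.zeros(5)
W2 = rng.normal(size=(2, 5))   # weighted connections: hidden -> output
b2 = np.zeros(2)

def mlp_forward(x):
    """Feedforward pass: each node weighs the outputs of the previous
    layer's nodes; hidden nodes apply the sigmoid nonlinearity."""
    h = sigmoid(W1 @ x + b1)   # hidden-layer activations
    return W2 @ h + b2         # linear output layer

x = rng.normal(size=3)         # a single input pattern
print(mlp_forward(x))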
The interest in MLPs for approximating input–output mappings has been motivated by the proofs of theorems demonstrating that any continuous mapping can be approximated by an MLP with at least one hidden layer whose output functions are sigmoid (or monotonic) functions [2]. It is worth mentioning that this notion of universal approximation merely states that the NN can learn such a mapping, not what in practice it really does learn.
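For concreteness, one classical form of this result (Cybenko's theorem, recalled here as background rather than taken from the chapter) asserts that finite sums

$g(x) = \sum_{i=1}^{N} v_i \, \sigma(w_i^{\top} x + b_i)$

with a sigmoid $\sigma$ are dense in the space of continuous functions on a compact domain, so any continuous mapping can be approximated to arbitrary accuracy by a suitable choice of $N$ and of the weights.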
NNs are adaptive systems that are trained with the aim of deriving an optimal set of weight matrices. The training is carried out through a specific “learning” procedure. The learning can be supervised (SL), unsupervised (UL), semisupervised (SSL), or based on reinforcement (RL). In SL, NNs are forced to associate their outputs with real (or complex) valued targets fixed by a “teacher,” through a procedure (typically gradient-based) that minimizes the approximation error (a “cost” function).
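As an illustrative sketch (again not from the chapter), the following Python fragment trains such a one-hidden-layer MLP by plain gradient descent on a mean-squared-error cost; the data, the network size, and the learning rate are arbitrary assumptions for the example.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 1))          # input patterns
T = np.sin(np.pi * X)                          # targets fixed by the "teacher"

W1, b1 = rng.normal(size=(8, 1)), np.zeros((8, 1))
W2, b2 = rng.normal(size=(1, 8)), np.zeros((1, 1))
lr = 0.5                                       # arbitrary learning rate

for epoch in range(2000):
    # Forward pass
    H = sigmoid(W1 @ X.T + b1)                 # hidden activations, shape (8, N)
    Y = W2 @ H + b2                            # network outputs, shape (1, N)
    E = Y - T.T                                # approximation error
    # Backward pass: gradients of the mean-squared-error cost
    N = X.shape[0]
    dW2 = (E @ H.T) / N
    db2 = E.mean(axis=1, keepdims=True)
    dH = W2.T @ E
    dZ = dH * H * (1 - H)                      # derivative of the sigmoid
    dW1 = (dZ @ X) / N
    db1 = dZ.mean(axis=1, keepdims=True)
    # Gradient-descent weight update
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print("final cost:", float((E ** 2).mean()))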
FIGURE 11.1
(A) A shallow (one hidden layer) and (B) a deep (multiple hidden layers) neural network.