crossover and mutation. For a detailed explanation of this issue see, e.g., Vonk et
al. (1997). As a matter of fact, genetic training of neural networks is also plagued
by the local minima problem, and may converge slowly, since important
chromosomal information may take a long time to appear. The
advantage of genetic training is that it can be applied to a wide variety of neural
architectures, input values and error formulations.
The Neuro-Genetic program, included in the book CD (see Appendix B), is a
tool for designing neural networks with either genetic or back-propagation
algorithms, allowing a comparison of the two approaches. The two-class cork
stoppers problem was analysed with the Neuro-Genetic program for an MLP 2:1:1
configuration. Two features were used (N, PRT10) and the cases were equally
divided between training and test sets (50 cases each). Using an initial population
of 10 chromosomes with Pm = Pc = 0.1 (mutation and crossover probabilities),
1-point crossover and elitism, a test error estimate of 10% was achieved, similar
to the performance provided by back-propagation.
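As a rough illustration of this kind of genetic training (a minimal sketch, not a reproduction of the Neuro-Genetic program; the data handling, fitness choice and helper names are hypothetical), the following Python fragment evolves the five weights of a 2:1:1 MLP using the parameter values quoted above:

   import numpy as np

   rng = np.random.default_rng(0)

   def mlp_output(w, X):
       # Decode a flat chromosome into the weights of a 2:1:1 MLP
       # (2 inputs -> 1 hidden unit -> 1 output, with biases: 5 genes).
       w1, b1 = w[0:2], w[2]                    # input-to-hidden weights, bias
       w2, b2 = w[3], w[4]                      # hidden-to-output weight, bias
       h = np.tanh(X @ w1 + b1)                 # hidden activation
       return 1 / (1 + np.exp(-(w2 * h + b2)))  # sigmoid output

   def fitness(w, X, t):
       # Fitness = negative squared error on the training set (assumed here;
       # other error formulations can be plugged in unchanged).
       return -np.sum((mlp_output(w, X) - t) ** 2)

   def genetic_train(X, t, pop_size=10, p_cross=0.1, p_mut=0.1, generations=200):
       pop = rng.normal(size=(pop_size, 5))
       for _ in range(generations):
           scores = np.array([fitness(w, X, t) for w in pop])
           pop = pop[np.argsort(scores)[::-1]]        # rank by fitness
           new_pop = [pop[0].copy()]                  # elitism: keep the best
           while len(new_pop) < pop_size:
               a, b = pop[rng.integers(0, pop_size // 2, size=2)]
               child = a.copy()
               if rng.random() < p_cross:             # 1-point crossover
                   cut = rng.integers(1, 5)
                   child[cut:] = b[cut:]
               mask = rng.random(5) < p_mut           # gene-wise mutation
               child[mask] += rng.normal(scale=0.3, size=mask.sum())
               new_pop.append(child)
           pop = np.array(new_pop)
       return pop[0]

Note that back-propagation plays no role here: the weights are improved only by selection, crossover and mutation, which is why the method applies equally well to non-differentiable error formulations.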
Genetic algorithms also have other applications, namely for generating and
analysing neural nets. An interesting application is the combination of genetic
algorithms with the probabilistic neural networks presented in section 4.3, in
order to perform feature selection quickly. The genetic algorithm then provides a wide
search in the feature space. Several datasets analysed in this chapter underwent
feature selection using this method. In the many experiments performed it was
found that this method of feature selection tends to discard too many features. For
instance, for the foetal weight problem, the method only found feature AP as a
useful feature, although at least two more features are definitely useful, as shown
in sections 5.6 and 5.7.
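A minimal sketch of this idea follows, assuming binary chromosomes that mark which features are kept and a Gaussian-kernel PNN as the fitness evaluator (all parameter values and names are illustrative, not those used in the experiments above):

   import numpy as np

   rng = np.random.default_rng(1)

   def pnn_error(X_tr, y_tr, X_te, y_te, sigma=1.0):
       # PNN/Parzen classifier: each class score is the sum of Gaussian
       # kernels centred on that class's training patterns.
       errors = 0
       for x, y in zip(X_te, y_te):
           k = np.exp(-np.sum((X_tr - x) ** 2, axis=1) / (2 * sigma ** 2))
           errors += int((k[y_tr == 1].sum() > k[y_tr == 0].sum()) != bool(y))
       return errors / len(y_te)

   def ga_feature_selection(X_tr, y_tr, X_te, y_te,
                            pop_size=20, p_mut=0.05, generations=50):
       n_feat = X_tr.shape[1]
       pop = rng.integers(0, 2, size=(pop_size, n_feat))  # 1 = feature kept
       for _ in range(generations):
           fit = np.array([
               -pnn_error(X_tr[:, c == 1], y_tr, X_te[:, c == 1], y_te)
               if c.any() else -1.0                       # empty subset: worst
               for c in pop
           ])
           pop = pop[np.argsort(fit)[::-1]]
           children = pop[:pop_size // 2].copy()          # keep fitter half
           mask = rng.random(children.shape) < p_mut      # bit-flip mutation
           children = np.where(mask, 1 - children, children)
           pop = np.vstack([pop[:pop_size // 2], children])
       return pop[0]                                      # best feature mask

Since the fitness rewards low error regardless of how many bits are set, nothing in this scheme penalises dropping a feature whose contribution is small, which is consistent with the tendency, noted above, to discard too many features.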
5.9 Radial Basis Functions
The radial basis functions approach constitutes an alternative feed-forward
architecture to the two-layer MLP, for performing classification or regression
tasks. It is based on the exact interpolation method of determining a function h(x)
that fits the target values ti for all training patterns xi:

   h(xi) = ti,   i = 1, ..., n.
The radial basis functions approach for solving this problem consists of
approximating h(x) by a weighted series of a kernel function φ(d), which depends
on the distance d of a feature vector x to a prototype vector xi:

   h(x) = Σi wi φ(||x − xi||).
Note the striking similarity between this formula and the formula of the
generalized decision function (2-4), or the formula of the Parzen window estimate
(4-36).
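A minimal sketch of exact RBF interpolation, assuming a Gaussian kernel φ(d) = exp(−d²/2σ²) (the kernel choice and σ are assumptions for illustration): the weights are obtained by solving the linear system Φw = t, whose solution guarantees h(xi) = ti at every prototype:

   import numpy as np

   def rbf_exact_interpolation(X, t, sigma=1.0):
       # Interpolation matrix: Phi[i, j] = phi(||x_i - x_j||), with a
       # Gaussian kernel phi(d) = exp(-d^2 / (2 sigma^2)).
       d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
       Phi = np.exp(-d2 / (2 * sigma ** 2))
       # Exact interpolation: solve Phi @ w = t, so that h(x_i) = t_i.
       return np.linalg.solve(Phi, t)

   def rbf_evaluate(w, X_proto, x, sigma=1.0):
       # h(x) = sum_i w_i * phi(||x - x_i||)
       d2 = np.sum((X_proto - x) ** 2, axis=1)
       return w @ np.exp(-d2 / (2 * sigma ** 2))

With one weight per training pattern the fit is exact at the prototypes; as with the Parzen window estimate, the kernel width σ controls how smoothly h(x) behaves between them.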