5.12 Hopfield Networks
In practical applications, a robust estimate of the number of prototype patterns
that can be retrieved with a low error rate is d/10. For instance, for c=10 one must
employ a Hopfield net with at least d=100 neurons. The prototype patterns must
also be carefully chosen, with small correlations among them, in order to obtain the
best performance.
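To make the d/10 rule concrete, the following minimal sketch (an illustration, not taken from the text; all names are assumptions) stores c=10 random bipolar prototypes in a d=100 neuron Hopfield net using the Hebbian (outer-product) rule and counts how many of them are fixed points of the update rule, i.e. retrievable without error.

import numpy as np

rng = np.random.default_rng(0)
d, c = 100, 10                          # c = d/10: at the capacity estimate

# c random bipolar (+1/-1) prototype patterns of dimension d
prototypes = rng.choice([-1, 1], size=(c, d))

# Hebbian (outer-product) weight matrix with zero self-connections
W = prototypes.T @ prototypes / d
np.fill_diagonal(W, 0.0)

def update(x, W):
    """One synchronous update of all neurons (ties resolved to +1)."""
    return np.where(W @ x >= 0.0, 1, -1)

# A prototype is retrievable without error only if it is a fixed point
stable = sum(np.array_equal(update(p, W), p) for p in prototypes)
print(f"{stable} of {c} prototypes are stable")

With strongly correlated prototypes the count of stable patterns drops quickly, which is why the text insists on choosing prototypes with small correlations among them.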
Figure 5.55 shows a set of 8 prototype patterns, drawn in a 12x10 grid, presented
in the work of Lippmann (1987) and specially designed to produce good
performance. Notice that the digit patterns are drawn in such a way that the amount
of correlation among them is kept low. The number of classes is also lower than
d/10 = 12. One obtains, therefore, quite good results even for heavily noise-corrupted
patterns. An example is shown in Figure 5.56 with a noise-corrupted
"nine". The network converges to the correct prototype after a few cycles.
Figure 5.56. A noise-corrupted "nine" converges after a few steps (left to right) to
the correct prototype.
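The recall behaviour illustrated in Figure 5.56 can be sketched in the same vein. The fragment below (an illustration, not the book's experiment) reuses the weight matrix W, the prototypes and the dimension d from the previous sketch, flips a fraction of the bits of one prototype and applies asynchronous updates until a full cycle produces no change; the noise level and the random update order are assumptions.

def recall(x0, W, max_cycles=20, seed=1):
    """Asynchronous recall: update neurons one at a time, in random order,
    until a full cycle produces no change (a fixed point is reached)."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(max_cycles):
        changed = False
        for i in rng.permutation(len(x)):
            new_state = 1 if W[i] @ x >= 0.0 else -1
            if new_state != x[i]:
                x[i], changed = new_state, True
        if not changed:
            break
    return x

# Corrupt the first prototype by flipping 20% of its bits, then recall it
noisy = prototypes[0].copy()
flip = np.random.default_rng(2).choice(d, size=d // 5, replace=False)
noisy[flip] *= -1
print(np.array_equal(recall(noisy, W), prototypes[0]))   # usually True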
5.13 Modular Neural Networks
The principle of "divide and conquer", already applied to the decision tree design
in section 4.6, can also be applied to obtain improved neural net solutions.
Consider the CTG dataset with the difficult 10-class discrimination problem solved
with an MLP in section 5.7.1. In that solution, considerable overlap is observed
among classes A, FS and SUSP. One could, therefore, consider developing a
hierarchical approach for this classification task, starting with a two-class split
as shown in Figure 5.57.
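The structure of such a hierarchical solution can be sketched as follows. This is an illustrative skeleton under assumed names (TwoLevelClassifier, root_net, left_net, right_net), not code from the book: a first-level two-class net routes each case to one of two branch nets, and each branch net solves a smaller discrimination problem.

import numpy as np

class TwoLevelClassifier:
    """Hierarchical 'divide and conquer' classifier: a root two-class net plus
    one net per branch. Any objects with fit/predict can play the three roles."""

    def __init__(self, root_net, left_net, right_net, left_classes):
        self.root, self.left, self.right = root_net, left_net, right_net
        self.left_classes = list(left_classes)

    def fit(self, X, y):
        goes_left = np.isin(y, self.left_classes)
        self.root.fit(X, goes_left)                    # level 1: two-class split
        self.left.fit(X[goes_left], y[goes_left])      # level 2: left subtree
        self.right.fit(X[~goes_left], y[~goes_left])   # level 2: right subtree
        return self

    def predict(self, X):
        route = np.asarray(self.root.predict(X), dtype=bool)
        y_pred = np.empty(len(X), dtype=object)
        if route.any():
            y_pred[route] = self.left.predict(X[route])
        if (~route).any():
            y_pred[~route] = self.right.predict(X[~route])
        return y_pred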
Let us consider the design of the left side of the tree. Table 5.9 shows the
classification matrices that were achieved using neural nets derived by Statistica,
and trained with the conjugate gradient method. The neural net used for the first-level
two-class discrimination is an MLP6:7:1, with features LB, AC, MLTV, DL,
MEAN and MEDIAN. For the three-class discrimination, at the second level, an
MLP9:5:3 with features LB, AC, UC, ASTV, MSTV, ALTV, MEDIAN, MEAN
and T was employed. Notice that these neural nets are less complex than the one
used in section 5.7.1 and are, therefore, substantially easier to train and better able
to generalize (see Exercise 5.26).
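As a rough equivalent of this Statistica setup, and purely as an assumption on my part, the two nets could be reproduced with scikit-learn's MLPClassifier. Note that scikit-learn offers no conjugate-gradient solver, so the quasi-Newton 'lbfgs' solver is used as a stand-in, and the feature lists simply name the CTG columns cited in the text.

from sklearn.neural_network import MLPClassifier

level1_features = ["LB", "AC", "MLTV", "DL", "MEAN", "MEDIAN"]        # 6 inputs
level2_features = ["LB", "AC", "UC", "ASTV", "MSTV", "ALTV",
                   "MEDIAN", "MEAN", "T"]                             # 9 inputs

# MLP6:7:1 analogue: one hidden layer of 7 units for the two-class split
net_6_7_1 = MLPClassifier(hidden_layer_sizes=(7,), solver="lbfgs", max_iter=2000)

# MLP9:5:3 analogue: one hidden layer of 5 units for the three-class branch
net_9_5_3 = MLPClassifier(hidden_layer_sizes=(5,), solver="lbfgs", max_iter=2000)

# These could be plugged into the TwoLevelClassifier sketch above, fitting each
# net on its own feature subset of the CTG data before combining the decisions.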