
5.12 Hopfield Networks

In practical applications, a robust estimate of the number of prototype patterns that can be retrieved with a low error rate is d/10. For instance, to store c = 10 prototypes one must employ a Hopfield net with at least d = 100 neurons. The prototype patterns must also be carefully chosen, with small correlations among them, in order to obtain the best performance.
Figure 5.55 shows a set of 8 prototype patterns, drawn in a 12x10 grid, presented in the work of Lippmann (1987) and specially designed to produce good performance. Notice that the digit patterns are drawn in such a way that the amount of correlation among them is kept low. The number of classes is also lower than d/10 = 12. One obtains, therefore, quite good results even for heavily noise-corrupted patterns. An example is shown in Figure 5.56 with a noise-corrupted "nine". The network converges to the correct prototype after a few cycles.















Figure 5.56. A noise-corrupted "nine" converges after a few steps (left to right) to the correct prototype.
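The storage and retrieval mechanism just described can be sketched in a few lines of NumPy. This is a minimal illustration, not the Lippmann digit experiment itself: it stores two orthogonal 16-bit prototypes (well below the d/10 limit) with the Hebbian rule and then recovers one of them from a corrupted version; all sizes and names here are illustrative.

```python
import numpy as np

def train_hopfield(patterns):
    # Hebbian storage rule: accumulate outer products of the bipolar
    # prototypes, then zero the diagonal so no neuron feeds back on itself.
    W = np.zeros((patterns.shape[1],) * 2)
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)
    return W

def recall(W, x, n_sweeps=5):
    # Asynchronous updates: revisit every neuron a few times; the net
    # settles once no update changes any state.
    x = x.copy()
    for _ in range(n_sweeps):
        for i in range(len(x)):
            x[i] = 1 if W[i] @ x >= 0 else -1
    return x

# Two orthogonal d = 16 prototypes (a toy stand-in for the 12x10
# digit grid of Figure 5.55).
p1 = np.array([1] * 8 + [-1] * 8)
p2 = np.array([1, -1] * 8)
W = train_hopfield(np.stack([p1, p2]))

noisy = p1.copy()
noisy[:2] *= -1          # corrupt 2 of the 16 bits
print(np.array_equal(recall(W, noisy), p1))   # → True
```

Because the two prototypes are orthogonal and the corruption is small, the field at every neuron already points toward p1, so a single sweep restores the prototype exactly.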





                           5.13  Modular Neural Networks

The principle of "divide and conquer", already applied to the decision tree design in section 4.6, can also be applied to obtain improved neural net solutions. Consider the CTG dataset with the difficult 10-class discrimination problem solved with an MLP in section 5.7.1. In that solution a considerable overlap is observed for classes A, FS and SUSP. One could, therefore, consider developing a hierarchical approach for this classification task, starting with a two-class split as shown in Figure 5.57.
Let us consider the design of the left side of the tree. Table 5.9 shows the classification matrices that were achieved using neural nets derived by Statistica and trained with the conjugate gradient method. The neural net used for the first-level two-class discrimination is an MLP6:7:1, with features LB, AC, MLTV, DL, MEAN and MEDIAN. For the three-class discrimination at the second level, an MLP9:5:3 with features LB, AC, UC, ASTV, MSTV, ALTV, MEDIAN, MEAN and T was employed. Notice that these neural nets are less complex than the one used in section 5.7.1 and are, therefore, substantially easier to train and generalize (see Exercise 5.26).
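The two-level routing described above can be sketched as follows. This is a hedged toy illustration, not the CTG/Statistica experiment: it uses synthetic two-dimensional data with three classes and nearest-centroid classifiers as lightweight stand-ins for the MLPs, one classifier per tree node.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the CTG data: three well-separated Gaussian
# classes (all names and values here are illustrative only).
centers = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 3.0]])
X = np.vstack([rng.normal(c, 0.5, size=(50, 2)) for c in centers])
y = np.repeat([0, 1, 2], 50)

class NearestCentroid:
    # Tiny classifier used at each node of the tree; an MLP (as in the
    # text) would be fitted the same way, one per node.
    def fit(self, X, y):
        self.labels = np.unique(y)
        self.centroids = np.array([X[y == k].mean(axis=0) for k in self.labels])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None] - self.centroids[None], axis=2)
        return self.labels[d.argmin(axis=1)]

# Level 1: class 0 versus the merged super-class {1, 2}.
top = NearestCentroid().fit(X, (y > 0).astype(int))
# Level 2: discriminate 1 from 2, trained only on that subset.
sub = NearestCentroid().fit(X[y > 0], y[y > 0])

def predict(Xq):
    out = np.zeros(len(Xq), dtype=int)
    routed = top.predict(Xq) == 1      # samples sent to the {1, 2} branch
    out[routed] = sub.predict(Xq[routed])
    return out

print((predict(X) == y).mean() > 0.9)  # → True
```

The design point is that each node solves a simpler problem on a smaller label set, which is what makes the individual networks in the hierarchical solution easier to train than a single 10-class MLP.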