[Figure 5.57 (diagram): the root node {A, B, C, D, AD, DE, LD, SHIFT, SUSP, FS} is split into ω1 = {A, FS, SUSP} and ω2 = {B, C, D, AD, DE, LD, SHIFT}; ω1 is then split into the leaves A, FS and SUSP.]

Figure 5.57. First levels of a tree classifier for the CTG data, with merged classes A, FS and SUSP.



Looking at the results in Table 5.9, we see that some improvement has in fact been made over the previous solution concerning the discrimination of those three classes. In general, the hierarchical network approach can achieve better results if the simplified discriminations at the top levels can be solved in a very efficient way. Identification of clusters in the data, factor analysis and multidimensional scaling may be used to judge whether there are well-separated groups of classes appropriate for top-level discrimination.
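
As a rough sketch of this idea (not taken from the text), the following Python fragment applies multidimensional scaling and hierarchical clustering to the class centroids in order to check whether some classes form well-separated groups; the synthetic data, the scikit-learn/SciPy functions and all parameter values are assumptions made only for the illustration.

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.manifold import MDS
from scipy.cluster.hierarchy import linkage

# Synthetic stand-in for the CTG feature data: 10 classes, 8 features.
X, y = make_blobs(n_samples=1000, centers=10, n_features=8, random_state=0)

# One centroid per class.
classes = np.unique(y)
centroids = np.array([X[y == c].mean(axis=0) for c in classes])

# 2-D multidimensional scaling of the class centroids: classes whose centroids
# fall into clearly separated groups are candidates for a merged top-level
# discrimination (as with {A, FS, SUSP} in Figure 5.57).
coords = MDS(n_components=2, random_state=0).fit_transform(centroids)

# Agglomerative clustering of the centroids suggests a class tree directly.
class_tree = linkage(centroids, method="ward")
print(coords)
print(class_tree)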
Another motivation for using a modular approach occurs when searching for the "best" neural net solution to a given problem. During the search process one often derives alternative solutions that are then discarded. These solutions may use the same inputs with different initial weights, or use alternative input sets. Discarding such solutions may not be the most reasonable approach, particularly if they can add complementary information to the problem at hand. Instead, we may profit from the complementary characteristics of these nets and achieve a better-performing solution by using an ensemble of neural networks. We can do this by establishing a voting scheme based on the net outputs, as shown in Figure 5.58.
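
The voting idea can be sketched as follows (again not from the text): several MLPs are trained on the same inputs with different initial weights and their output probabilities are averaged, a simple soft-voting scheme; the scikit-learn MLPs, the synthetic data and all parameter values are assumptions made only for the illustration.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the real feature data.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Alternative solutions: the same inputs, different initial weights
# (controlled here by the random_state seed of each net).
nets = [MLPClassifier(hidden_layer_sizes=(15,), max_iter=1000,
                      random_state=seed).fit(X_tr, y_tr)
        for seed in range(5)]

# Voting scheme based on the net outputs: average the class probabilities
# of all nets (soft voting) and pick the most probable class.
probs = np.mean([net.predict_proba(X_te) for net in nets], axis=0)
ensemble_pred = probs.argmax(axis=1)

print("single net accuracy:", nets[0].score(X_te, y_te))
print("ensemble accuracy:  ", (ensemble_pred == y_te).mean())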



                              Table  5.9.  MLPs  classification  matrices  for  the  class  discriminations  shown in
                              Figure 5.57. True classifications along the columns; predicted classifications along
                              the rows.
[Table 5.9 body: classification matrices (a) and (b); only the two summary rows are legible in the source:
                 97.14%   85.51%   99.49%
                 92.06%   81.04%   94.29%]
Pc - Probability of correct classification (training set estimate) at each level.
Pt - Total probability of correct classification (training set estimate for classes A, FS and SUSP).