
132    4 Statistical Classification


                                and node classification based on the separability properties of the features. Notice
                                from (4-50) that in order to obtain a class-level classification performance that is
                                better than the one obtained by a non-hierarchical approach, one must have very high
                                performances at each node. For instance, if for the tree in Figure 4.38 both
                                node-level probabilities of correct classification, $P_c$, on the path to a class have
                                a value of 0.94, then the probability of correctly classifying that class is
                                $0.94^2 \approx 0.88$. With a larger tree, if this 0.94 correct classification rate is
                                iterated 4 times, one obtains an error of 22% ($1 - 0.94^4 \approx 0.22$). The error can
                                therefore degrade drastically along a tree path.
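
The compounding effect can be checked numerically. The following is a minimal sketch, assuming (as in the example above) that every node on a path has the same probability of correct classification and that the path probability is the product of the node probabilities. The function name `path_correct_rate` is illustrative only.

```python
def path_correct_rate(node_rate: float, depth: int) -> float:
    """Probability of correct classification after `depth` nodes.

    Assumes each node on the path is correct with probability `node_rate`
    and that the path probability is the product of the node probabilities.
    """
    return node_rate ** depth


for depth in (1, 2, 4):
    pc = path_correct_rate(0.94, depth)
    print(f"depth {depth}: Pc = {pc:.2f}, error = {1 - pc:.2f}")
```

For depths 2 and 4 this gives correct classification rates of about 0.88 and 0.78, i.e. errors of about 12% and 22%.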
                                  Let us now illustrate a practical tree classifier design using the Breast Tissue
                                dataset (electric impedance measurements of freshly excised breast tissue) with 6
                                classes denoted car (carcinoma), fad (fibro-adenoma), gla (glandular), mas
                                (mastopathy), con (connective) and adi (adipose). Some features of this dataset can
                                be well modeled by a normal distribution in some classes, namely I0, AREA-DA
                                and IPMAX. A Kruskal-Wallis analysis readily shows that all the features have
                                discriminative capabilities and that it is practically impossible to discriminate
                                between classes gla, fad and mas. The low dimensionality ratio of this dataset for
                                the individual classes (e.g. only 14 cases for class con) strongly suggests a
                                decision tree approach, with the use of merged classes and a greatly reduced
                                number of features at each node.
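
As an illustration of the Kruskal-Wallis analysis mentioned above, the following is a minimal pure-Python sketch of the H statistic for one feature across several classes. It is not the procedure used to obtain the results in the text; the function name `kruskal_wallis_h` and the sample values are hypothetical and are not taken from the Breast Tissue dataset.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction.

    `groups` is a list of samples, one per class, each a list of feature
    values. Assumes the pooled values are not all identical.
    """
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    counts = Counter(pooled)

    # Assign each distinct value its average rank (ties share the mean rank).
    rank_of = {}
    start = 1
    for value in sorted(counts):
        c = counts[value]
        rank_of[value] = start + (c - 1) / 2.0
        start += c

    # H = 12 / (n (n + 1)) * sum_j (R_j^2 / n_j) - 3 (n + 1)
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3.0 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    ties = sum(c ** 3 - c for c in counts.values())
    return h / (1.0 - ties / (n ** 3 - n))


# Hypothetical feature values for three classes (made-up numbers).
sample = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(kruskal_wallis_h(sample))
```

A large H indicates that the feature separates at least some of the classes. Its significance is usually judged against a chi-square distribution with k - 1 degrees of freedom, where k is the number of classes, although this approximation is rough for very small samples.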






                                [Scatter plot: horizontal axis I0, ticks from -200 to 2800; legend entries for
                                classes car, fad, mas, gla and con.]

                                Figure 4.39. Scatter plot of six classes of breast tissue using features I0 and
                                PA500.




                                   As  I0  and  PA500  are  promising  features,  it  is  worthwhile  to  look  at  the
                                respective  scatter  diagram  shown  in  Figure  4.39.  Two  clusters  are  visually
                                identified:  one  corresponding  to  {con, adi}, the  other  to  {mas, gla, fad,  car}.
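
The two clusters identified above can be turned into merged class labels for a first tree node. The following is a minimal sketch; the dictionary `MERGED`, the merged label strings and the function `merge_labels` are illustrative names, not part of the original design.

```python
# Merged classes for the first node, following the two clusters seen in
# the scatter plot: {con, adi} and {mas, gla, fad, car}.
# The label strings below are hypothetical names chosen for illustration.
MERGED = {
    "con": "con+adi",
    "adi": "con+adi",
    "mas": "mas+gla+fad+car",
    "gla": "mas+gla+fad+car",
    "fad": "mas+gla+fad+car",
    "car": "mas+gla+fad+car",
}


def merge_labels(labels):
    """Map the original six class labels onto the two merged classes."""
    return [MERGED[label] for label in labels]


print(merge_labels(["car", "adi", "gla", "con"]))
```

A first-level classifier would then be trained on the merged labels only, leaving the separation within each merged class to the nodes further down the tree.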