
       for the H values. For the cork stopper features all p values are zero; therefore, Table
       2.1 lists only the H values, from the most discriminative feature, ART, to the least
       discriminative, N.



       Table 2.1. Cork stopper features in descending order of Kruskal-Wallis' H (three
       classes).

                Feature                                           H
                ART                                             121.6
                PRTM                                            117.6
                PRT                                             115.7
                ARTG                                            115.7
                PRTG                                            113.3
                ARTM                                            111.5
                RAAR                                            105.2
                NG                                              104.4
                RAN                                              94.3
                N                                                74.5
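
       As a rough computational sketch (not part of the text), a ranking like that of Table
       2.1 can be obtained by applying the Kruskal-Wallis test to each feature in turn. The
       cork-stopper measurements are not reproduced here, so the data, the three-class
       split and the feature-name list below are stand-ins for illustration only.

import numpy as np
from scipy.stats import kruskal

def rank_features_by_kw(X, y, feature_names):
    """Return (name, H, p) tuples sorted by descending Kruskal-Wallis H.

    X : (n_samples, n_features) array of feature values
    y : (n_samples,) array of class labels (e.g. 1, 2, 3)
    """
    classes = np.unique(y)
    ranking = []
    for j, name in enumerate(feature_names):
        groups = [X[y == c, j] for c in classes]   # one sample of feature j per class
        H, p = kruskal(*groups)                    # Kruskal-Wallis H and its p value
        ranking.append((name, H, p))
    return sorted(ranking, key=lambda r: r[1], reverse=True)

# Hypothetical usage with random stand-in data (150 cases, 3 classes of 50):
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 10))
y = np.repeat([1, 2, 3], 50)
names = ["ART", "PRTM", "PRT", "ARTG", "PRTG", "ARTM", "RAAR", "NG", "RAN", "N"]
for name, H, p in rank_features_by_kw(X, y, names):
    print(f"{name:5s}  H = {H:6.1f}   p = {p:.4f}")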





       2.6  The Dimensionality Ratio Problem

       In section 2.1.1 we saw that a complex decision surface in a low-dimensionality
       space can be made linear, and therefore much simpler, in a higher-dimensionality
       space. Let us take another look at formula (2-14a). It is obvious that by adding
       more features one can only increase the distance of a pattern to a class mean.
       Therefore, it seems that working in high-dimensionality feature spaces, or,
       equivalently, using arbitrarily complex decision surfaces (e.g. Figure 2.5), can only
       increase the classification or regression performance.
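       To see why the distance can only grow, assume (as the surrounding discussion
       suggests) that formula (2-14a) is the squared Euclidean distance of a pattern x to
       the class mean m_k, d²(x, m_k) = Σ_{i=1..d} (x_i − m_ki)². Appending a (d+1)-th
       feature adds the non-negative term (x_{d+1} − m_{k,d+1})², so the distance computed
       with d+1 features is always greater than or equal to the distance computed with d
       features; it can never decrease.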
          However, this expected performance increase is not verified in practice. The
       reason for this counterintuitive result, related to the reliable estimation of classifier
       or regressor parameters, is one of the forms of what is generally known as the
       curse of dimensionality.
          In order to get some insight into what is happening, let us assume that we are
       confronted with a two-class classification problem whose data collection
       progresses slowly. Initially we only have n=6 patterns available for each class. The
       patterns are represented in a d=2 feature space as shown in Figure 2.23. The two
       classes seem linearly separable with 0% error. Fine!
          Meanwhile, a few more cases have been gathered and we now have n=10
       patterns available. The corresponding scatter plot is shown in Figure 2.24. Hum! ...
       The situation is getting tougher. It seems that, after all, the classes were not as
       separable as we had imagined... However, a quadratic decision function still seems
       to provide a solution...
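
       The small-sample effect behind this example can be simulated. The sketch below
       (an illustration only, not the data of Figures 2.23 and 2.24) draws both classes
       from the same distribution, so there is nothing real to separate, and counts how
       often a linear classifier nevertheless fits the training patterns with 0% error as
       the dimensionality d grows relative to the n patterns per class.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def apparent_separability(n_per_class, d, trials=200):
    """Fraction of random training sets a linear classifier fits with 0% error."""
    perfect = 0
    for _ in range(trials):
        X = rng.normal(size=(2 * n_per_class, d))   # two classes, no real difference
        y = np.repeat([0, 1], n_per_class)
        clf = LogisticRegression(C=1e6, max_iter=5000).fit(X, y)  # near hard-margin fit
        perfect += clf.score(X, y) == 1.0            # zero training error?
    return perfect / trials

for n in (6, 10):
    for d in (2, 5, 10):
        frac = apparent_separability(n, d)
        print(f"n={n:2d} per class, d={d:2d}: {frac:.0%} of noise samples look separable")

       The fraction of "perfectly separable" noise samples rises sharply as d approaches
       the number of patterns per class, which is precisely the dimensionality ratio
       effect this section describes.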