
34     2 Pattern Discrimination


         to appreciating visually. As can be seen in Figure 2.13, there is a certain amount of
         correlation between the features. One might wonder what would happen if the
         features were not measured in approximately the same ranges, if for instance we
         used the original PRT feature. We can do this by increasing the PRT10 scale ten
         times, as shown in Figure 2.14, where we also changed the orientation of the axes
         so that the plot fits reasonably on the page.







         Figure 2.14. Scatter plot for two classes of cork stoppers with PRT10 scale
         increased ten times.



           It  is  now  evident  that  the  measurement  unit  has  a  profound  influence  on  the
         Euclidian distance  measures,  and  in  the  visual  clustering of  the  patterns  as well.
         Namely,  in  Figure  2.14,  the  contribution  of  N  to  class  discrimination  is,  in  the
         Euclidian distance sense, negligible.
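
           The effect can be checked numerically. The following is a minimal sketch in
         Python, assuming two hypothetical patterns with made-up (N, PRT10) values that
         are not taken from the cork stoppers data; the helper name
         share_of_first_feature is ours. It computes the fraction of the squared
         Euclidian distance that is due to N, before and after multiplying the second
         feature by ten.

```python
def share_of_first_feature(a, b):
    """Fraction of the squared Euclidian distance between a and b due to feature 0."""
    squared_diffs = [(u - v) ** 2 for u, v in zip(a, b)]
    return squared_diffs[0] / sum(squared_diffs)


# Hypothetical (N, PRT10) values for two patterns; illustrative only.
p = (30.0, 20.0)
q = (60.0, 50.0)

# Same patterns with the second feature expressed on a scale ten times larger.
p_scaled = (p[0], p[1] * 10)
q_scaled = (q[0], q[1] * 10)

share_before = share_of_first_feature(p, q)                # 900 / 1800 = 0.5
share_after = share_of_first_feature(p_scaled, q_scaled)   # 900 / 90900, about 0.0099
```

         With these assumed values, N accounts for half of the squared distance on the
         comparable scales and for about 1% of it once the other feature is rescaled.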
           The usual way of equalizing the features' contributions consists of performing
         some scaling operation. A well-known scaling method consists of subtracting the
         mean and dividing by the standard deviation:

         $$ y_i = \frac{x_i - m_i}{s_i} $$

         where $m_i$ is the sample mean and $s_i$ is the sample standard deviation of feature $x_i$.
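
           As an illustration, this scaling can be sketched in a few lines of Python using
         only the standard library. The function name standardize and the data values are
         hypothetical, and we assume that the sample standard deviation uses the n - 1
         denominator, which is what statistics.stdev computes.

```python
import statistics


def standardize(values):
    """Subtract the sample mean and divide by the sample standard deviation."""
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in values]


# Hypothetical measurements of a single feature; illustrative only.
feature = [200.0, 400.0, 600.0, 800.0]
scaled = standardize(feature)

# The scaled feature has zero mean and unit standard deviation
# (up to floating-point rounding).
scaled_mean = statistics.mean(scaled)
scaled_std = statistics.stdev(scaled)
```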
           Using this common scaling method (yielding features with zero mean and unit
         variance), the squared Euclidian distance of the scaled feature vector y relative to
         the origin is expressed as:

         $$ \|\mathbf{y}\|^2 = \sum_i y_i^2 = \sum_i \frac{(x_i - m_i)^2}{s_i^2} $$
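
           A small numerical check of this expression may help. The sketch below assumes a
         hypothetical two-feature pattern x, with made-up means m and standard deviations
         s, and compares the per-feature terms of the squared distance before and after
         scaling. All names and values are illustrative assumptions.

```python
def squared_terms(x, m, s=None):
    """Per-feature terms of the squared distance to the mean.

    With s given, each term is (x_i - m_i)^2 / s_i^2, i.e. the scaled version.
    """
    if s is None:
        s = [1.0] * len(x)
    return [((xi - mi) / si) ** 2 for xi, mi, si in zip(x, m, s)]


# Hypothetical pattern, feature means and standard deviations.
x = (60.0, 500.0)
m = (50.0, 400.0)
s = (10.0, 100.0)

raw_terms = squared_terms(x, m)        # [100.0, 10000.0]
scaled_terms = squared_terms(x, m, s)  # [1.0, 1.0]
scaled_squared_distance = sum(scaled_terms)  # 2.0
```

         In this assumed example each feature lies one standard deviation above its mean.
         The unscaled terms differ by a factor of one hundred, whereas the scaled terms
         contribute equally.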






           Thus, the original squared Euclidian distance term $(x_i - m_i)^2$ has been scaled by
         $1/s_i^2$, shrinking large variance features with $s_i > 1$ and stretching low variance
         features with $s_i < 1$, therefore balancing the feature contribution to the squared
         Euclidian distance. We can obtain some insight into this scaling operation by
         imagining what it would do to an elliptical cluster with axes aligned with the
         coordinate axes. The simple scaling operation of (2-14a) would transform
         equidistant ellipses into equidistant circles. As a matter of fact, any linear