4.4 Feature Selection


                         removed. One must be quite conservative, however, in the specification of the
                         tolerance. A value at least as low as 1% is standard practice.
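
To illustrate what the tolerance measures, the sketch below computes it as 1 - R², where R² is the squared multiple correlation of a candidate feature with the features already selected. This is only a minimal sketch under stated assumptions: the function name `tolerance`, the synthetic data and the use of a plain total-sample regression are illustrative choices, and a statistical package may compute the correlation within groups instead.

```python
import numpy as np

def tolerance(X, candidate, selected):
    """Tolerance of column `candidate` given columns `selected`: 1 - R^2.

    Hypothetical helper: R^2 comes from an ordinary least-squares regression
    of the candidate feature on the already selected features.
    """
    y = X[:, candidate]
    if len(selected) == 0:
        return 1.0  # nothing selected yet, so nothing can explain the candidate
    # Design matrix: intercept column plus the selected features.
    A = np.column_stack([np.ones(len(y)), X[:, selected]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return ss_res / ss_tot  # equals 1 - R^2

# Synthetic data (assumption): three independent features plus a fourth
# that is almost a copy of feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X = np.column_stack([X, X[:, 0] + 0.01 * rng.normal(size=100)])

tol_indep = tolerance(X, 1, [0])             # unrelated feature: close to 1
tol_redundant = tolerance(X, 3, [0, 1, 2])   # near-duplicate: close to 0
print(tol_indep, tol_redundant)
```

A feature whose tolerance falls below the chosen threshold (0.01 for the default mentioned in the text) carries almost no information beyond the features already in the model.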
                           Figure 4.36 shows the summary of a forward search for the first two classes of
                         the cork stoppers data obtained with Statistica, using default values for the
                         tolerance (0.01) and F (1.0). The Wilks' lambda indicated in Figure 4.36 is equal to
                         the determinant of the pooled covariance divided by the determinant of the total
                         covariance. Physically, it can be interpreted as the ratio between the average class
                         volume and the total volume of the cluster constituted by all the patterns.
                         Therefore, it reflects the class separability: the smaller the value, the better
                         separated the classes. The F statistic is computed from the Wilks' lambda.
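
The determinant ratio described above can be sketched as follows. This is a hedged illustration rather than the computation of any particular package: the function name `wilks_lambda` and the synthetic two-class data are assumptions, and the sketch uses the within-class and total scatter (sum of squares and cross-products) matrices, which differ from the pooled and total covariance matrices only by normalization constants.

```python
import numpy as np

def wilks_lambda(X, y):
    """Wilks' lambda = det(W) / det(T).

    W is the pooled within-class scatter matrix and T the total scatter
    matrix of the patterns in X (rows) with class labels y.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    centered_total = X - X.mean(axis=0)
    T = centered_total.T @ centered_total
    W = np.zeros_like(T)
    for label in np.unique(y):
        Xc = X[y == label]
        centered = Xc - Xc.mean(axis=0)   # center on the class mean
        W += centered.T @ centered
    return np.linalg.det(W) / np.linalg.det(T)

# Synthetic data (assumption): two classes of 50 patterns, 2 features.
rng = np.random.default_rng(1)
a = rng.normal(size=(50, 2))
b = rng.normal(size=(50, 2))
y = np.array([0] * 50 + [1] * 50)

lam_sep = wilks_lambda(np.vstack([a, b + 10.0]), y)   # well-separated classes
lam_same = wilks_lambda(np.vstack([a, b]), y)         # overlapping classes
print(lam_sep, lam_same)
```

Well-separated classes give a value close to 0, since the class volumes are small compared with the total volume. Overlapping classes give a value close to 1.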
                           The four-feature solution shown in Figure 4.36 corresponds to the classification
                         matrix shown before in Figure 4.24b.
                           Using a backward search, the solution presented previously with only two
                         features (N and PRT) was obtained (see Figure 4.8). Notice that the backward
                         search usually needs to start with a very low tolerance value (in the present case
                         T=0.002 is sufficient).
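
A backward search of this kind can be sketched as below. This is a simplified illustration under stated assumptions, not the procedure of Statistica or SPSS: the names `backward_search` and `f_remove` are hypothetical, the data are synthetic, the tolerance check is omitted, and the F-to-remove is taken as the usual partial F derived from the ratio of Wilks' lambdas with and without the feature.

```python
import numpy as np

def wilks_lambda(X, y):
    """det(within-class scatter) / det(total scatter)."""
    Xt = X - X.mean(axis=0)
    T = Xt.T @ Xt
    W = np.zeros_like(T)
    for label in np.unique(y):
        Xc = X[y == label]
        Xc = Xc - Xc.mean(axis=0)
        W += Xc.T @ Xc
    return np.linalg.det(W) / np.linalg.det(T)

def backward_search(X, y, f_remove=1.0):
    """Start with all features; repeatedly drop the one with the lowest
    partial F while that F stays below `f_remove`."""
    n = X.shape[0]
    c = len(np.unique(y))
    selected = list(range(X.shape[1]))
    while len(selected) > 1:
        p = len(selected)
        lam_full = wilks_lambda(X[:, selected], y)
        f_values = {}
        for j in selected:
            rest = [k for k in selected if k != j]
            # Partial lambda: contribution of feature j given the others.
            partial = lam_full / wilks_lambda(X[:, rest], y)
            f_values[j] = (n - c - p + 1) / (c - 1) * (1.0 - partial) / partial
        weakest = min(f_values, key=f_values.get)
        if f_values[weakest] >= f_remove:
            break  # every remaining feature contributes enough
        selected.remove(weakest)
    return selected

# Synthetic data (assumption): features 0 and 1 separate the two classes,
# features 2 and 3 are pure noise.
rng = np.random.default_rng(2)
y = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 4))
X[y == 1, :2] += 5.0

kept = backward_search(X, y)
print(kept)
```

With a threshold as low as F = 1.0, noise features can survive by chance. The discriminating features 0 and 1 are retained in this example.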
                           It was already shown that this classifier solution uses a pooled covariance not
                         too far from the individual covariance matrices. Also, the dimensionality ratio is
                         comfortably high: n/d=25. One can therefore be confident that this classifier
                         performs in a nearly optimal way.

                          Figure 4.37. Feature selection using a dynamic search on the cork stoppers data
                          (three classes).



                            Figure 4.37 shows the listing produced by SPSS in a dynamic search performed
                          on the cork stoppers data (three classes), using the squared Bhattacharyya distance