
124    4 Statistical Classification

                         sequential forward floating search uses l and r determined automatically and
                         updated dynamically. It was found to perform almost as well as the branch and
                         bound method with much less computational effort (see Jain and Zongker, 1997).




                         Stepwise Analysis - Step 0
                         Number of variables in the model: 0
                         Wilks' Lambda: 1.000000
                         Stepwise Analysis - Step 1
                         Number of variables in the model: 1
                         Last variable entered:  ART   F (1, 99) = 136.5565  p < .0000
                         Wilks' Lambda: .4178098  approx. F (1, 98) = 136.5565  p < .0000
                         Stepwise Analysis - Step 2
                         Number of variables in the model: 2
                         Last variable entered:  PRM   F (1, 98) = 3.880044  p < .0517
                         Wilks' Lambda: .4017400  approx. F (2, 97) = 72.22485  p < .0000
                         Stepwise Analysis - Step 3
                         Number of variables in the model: 3
                         Last variable entered:  NG    F (1, 97) = 2.561449  p < .1128
                         Wilks' Lambda: .3912994  approx. F (3, 96) = 49.77880  p < .0000
                         Stepwise Analysis - Step 4
                         Number of variables in the model: 4
                         Last variable entered:  RAAR  F (1, 96) = 1.619636  p < .2062
                         Wilks' Lambda: .3847401  approx. F (4, 95) = 37.97999  p < .0000
                         Stepwise Analysis - Step 4 (Final Step)
                         Number of variables in the model: 4
                         Last variable entered:  RAAR  F (1, 95) = .3201987  p < .5728

                         Figure 4.36.  Feature selection using a forward search for two classes of the cork
                         stoppers data.
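The F values in the listing can be recovered, at least approximately, from the successive Wilks' Lambda values alone. The sketch below assumes the standard partial-F relation for adding one feature to a model that already holds p features, and the exact two-class F for the overall Lambda. The function names are hypothetical, and n = 100 cases is an assumption inferred from the degrees of freedom shown in the listing.

```python
def f_to_enter(lambda_before, lambda_after, n, g, p):
    """Partial F for adding one feature to a model that already holds p features.

    n = number of cases, g = number of classes.
    """
    ratio = lambda_after / lambda_before          # partial Wilks' Lambda
    return (n - g - p) / (g - 1) * (1.0 - ratio) / ratio


def overall_f_two_classes(lam, n, p):
    """F for the overall Wilks' Lambda with p features (two classes only)."""
    return (n - p - 1) / p * (1.0 - lam) / lam


# Wilks' Lambda values read from Figure 4.36 (Step 0 to Step 4).
lambdas = [1.0, 0.4178098, 0.4017400, 0.3912994, 0.3847401]
n, g = 100, 2   # assumption: n = 100 inferred from the degrees of freedom

f_enter = [f_to_enter(lambdas[p], lambdas[p + 1], n, g, p) for p in range(4)]
f_model = [overall_f_two_classes(lambdas[p], n, p) for p in range(1, 5)]

for step, (fe, fm) in enumerate(zip(f_enter, f_model), start=1):
    print(f"Step {step}: F to enter = {fe:.4f}, approx. F = {fm:.4f}")
```

Under these assumptions the computed values agree with the "Last variable entered" and "approx. F" entries of Steps 1 to 4 to roughly three decimal places, which suggests that each step's F to enter measures only the relative drop in Lambda caused by the newly added feature.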





                           Direct sequential search methods can be applied using Statistica and SPSS, the
                         latter affording a dynamic search procedure that is in fact a "plus 1-take away 1"
                         selection. As merit criterion, Statistica uses the ANOVA F (for all selected features at
                         a given step), with a default value of one. SPSS allows the use of other merit criteria
                         such as the squared Bhattacharyya distance.
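A direct sequential forward search with an F threshold can be sketched as follows. This is a minimal illustration, not the procedure implemented in Statistica or SPSS: it assumes Wilks' Lambda is computed as the ratio of the determinants of the within-class and total scatter matrices, uses the partial F derived from it as merit criterion, and stops when no candidate reaches the threshold `f_enter` (set to one here). The function names and the synthetic data are hypothetical; the cork stoppers data are not used.

```python
import numpy as np


def wilks_lambda(X, y, cols):
    """Wilks' Lambda = det(W) / det(T) for the feature subset `cols`."""
    if not cols:
        return 1.0
    Z = X[:, cols]
    D = Z - Z.mean(axis=0)
    T = D.T @ D                                   # total scatter matrix
    W = np.zeros_like(T)
    for c in np.unique(y):
        Dc = Z[y == c] - Z[y == c].mean(axis=0)
        W += Dc.T @ Dc                            # within-class scatter matrix
    return np.linalg.det(W) / np.linalg.det(T)


def forward_select(X, y, f_enter=1.0):
    """Add, one at a time, the feature with the largest partial F >= f_enter."""
    n, d = X.shape
    g = len(np.unique(y))
    selected, lam, history = [], 1.0, []
    while len(selected) < d:
        p = len(selected)
        best = None
        for j in range(d):
            if j in selected:
                continue
            lam_j = wilks_lambda(X, y, selected + [j])
            f = (n - g - p) / (g - 1) * (lam - lam_j) / lam_j
            if best is None or f > best[1]:
                best = (j, f, lam_j)
        if best[1] < f_enter:                     # no candidate is good enough
            break
        selected.append(best[0])
        lam = best[2]
        history.append(best)                      # (feature, F to enter, Lambda)
    return selected, history


# Synthetic two-class data: feature 0 discriminates strongly, feature 2 weakly.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 3))
X[:, 0] += 3.0 * y
X[:, 2] += 0.5 * y

selected, history = forward_select(X, y)
for j, f, lam in history:
    print(f"entered feature {j}: F = {f:.3f}, Wilks' Lambda = {lam:.4f}")
```

Because a feature is only admitted when its partial F reaches the threshold, Wilks' Lambda decreases at every accepted step. A backward check after each entry (removing a feature whose partial F falls below a second threshold) would turn this sketch into a "plus 1-take away 1" search.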
                           It is also common to set a lower limit to the so-called tolerance level, $T = 1 - R^2$,
                         which must be satisfied by all features, where R is the multiple correlation factor of
                         one candidate feature with all the others. Highly correlated features, which could
                         raise problems in the computation of the inverse covariance matrix, are therefore