Page 277 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

258      6 Statistical Classification


              Statistical software products such as SPSS and STATISTICA allow the
           selection of the cases used for training and for testing linear discriminant
           classifiers. With SPSS, it is possible to use a selection variable, which eases the
           task of specifying randomly selected samples. SPSS also supports leave-one-out
           classification. With STATISTICA, one can initially select the cases used for
           training (Selection Conditions option in the Tools menu) and, once the
           classifier is designed, specify the test cases (Select Cases button in the
           Classification tab of the command window). In MATLAB and R one may
           create a case-selecting vector, called a filter, with random 0s and 1s.
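
The filter idea described for MATLAB and R can be sketched in a few lines. The following is a minimal illustration in Python, not code from the book: the function names `make_filter` and `split_by_filter`, the 50% training fraction, the fixed seed, and the use of 1 for training and 0 for testing are all assumptions made for the example.

```python
import random

def make_filter(n_cases, train_fraction=0.5, seed=0):
    """Return n_cases random 0s and 1s (assumed here: 1 = training, 0 = test)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    return [1 if rng.random() < train_fraction else 0 for _ in range(n_cases)]

def split_by_filter(cases, flt):
    """Split the cases into training and test sets according to the filter."""
    train = [c for c, f in zip(cases, flt) if f == 1]
    test = [c for c, f in zip(cases, flt) if f == 0]
    return train, test

cases = list(range(100))              # hypothetical case indices
flt = make_filter(len(cases))         # the 0/1 filter vector
train, test = split_by_filter(cases, flt)
```

Every case ends up in exactly one of the two sets, so the classifier can be designed with `train` and evaluated with `test`. Note that a random filter of this kind does not guarantee equal set sizes or balanced classes.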

           Example 6.14

           Q: Consider the two-class cork-stopper classifier, with two features, presented in
           section 6.2.2 (see classification matrix in Table 6.3). Evaluate the performance of
           this classifier using the partition method with k = 3, and the leave-one-out method.
           A: Using the partition method with k = 3, a test set estimate of Pe_t = 9.9% was
           obtained, which is near the training set error estimate of 10%. The leave-one-out
           method also produces Pe_t = 10% (see Table 6.11; the “Original” matrix is the
           training set estimate, the “Cross-validated” matrix is the test set estimate). The
           closeness of these figures indicates reliable error estimation for this
           classification problem, which has a high dimensionality ratio (n/d = 25). Using
           formula 6.28, the 95% confidence limits for these error estimates are:
           s = 0.03 ⇒ Pe = 10% ± 5.9%.
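
The confidence limits quoted above can be checked numerically. Formula 6.28 is not reproduced on this page, so the sketch below assumes it is the usual binomial standard deviation of an error estimate, s = sqrt(Pe(1 − Pe)/n), with 95% limits taken as ±1.96 s. The value n = 100 (50 cases per class) and the function name `error_confidence_limits` are likewise assumptions for illustration.

```python
import math

def error_confidence_limits(pe, n, z=1.96):
    """Return (s, half_width) for an error estimate pe obtained on n cases.

    Assumes the binomial approximation s = sqrt(pe * (1 - pe) / n);
    z = 1.96 corresponds to a 95% confidence level.
    """
    s = math.sqrt(pe * (1.0 - pe) / n)
    return s, z * s

s, half_width = error_confidence_limits(pe=0.10, n=100)
print(f"s = {s:.2f}, Pe = 10% +/- {100 * half_width:.1f}%")
```

Under these assumptions the result agrees with the figures in the text (s = 0.03 and a half-width of about 5.9%).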


           Table 6.11. Listing of the classification matrices obtained with SPSS, using the
           leave-one-out method in the classification of the first two classes of the
           cork-stopper data with two features.

                                               Predicted Group Membership
                                      C            1             2          Total
           Original         Count     1           49             1             50
                                      2            9            41             50
                            %         1         98.0           2.0            100
                                      2         18.0          82.0            100
           Cross-validated  Count     1           49             1             50
                                      2            9            41             50
                            %         1         98.0           2.0            100
                                      2         18.0          82.0            100
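
The 10% error estimate follows directly from the counts in Table 6.11. As a small worked check, the sketch below computes the overall and per-class error rates from a classification matrix whose rows are true classes and whose columns are predicted classes. The function name `error_rates` is hypothetical; the counts are those of the table.

```python
def error_rates(matrix):
    """Return (overall_error, per_class_errors) for a square count matrix.

    Rows are true classes, columns are predicted classes, so the
    diagonal holds the correctly classified cases.
    """
    total = sum(sum(row) for row in matrix)
    wrong = total - sum(matrix[i][i] for i in range(len(matrix)))
    per_class = [(sum(row) - row[i]) / sum(row) for i, row in enumerate(matrix)]
    return wrong / total, per_class

# Counts from Table 6.11 (identical for the "Original" and "Cross-validated" matrices)
counts = [[49, 1],
          [9, 41]]
overall, per_class = error_rates(counts)
print(f"overall error = {overall:.1%}")
print("per-class errors =", [f"{e:.1%}" for e in per_class])
```

The per-class errors (2% for class 1, 18% for class 2) match the off-diagonal percentages in the table, and the overall error of 10% is the same for both matrices.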

           Example 6.15
           Q: Consider the three-class, cork-stopper classifier, with four features, determined
           in Example 6.13. Evaluate the performance of this classifier using the leave-one-
           out method.