

           linking the means. The only difference from the results of the previous section is
           that the hyperplanes separating class ωᵢ from class ωⱼ are now orthogonal to the
           vector Σ⁻¹(mᵢ − mⱼ).
              In practice, it is impossible to guarantee that all class covariance matrices are
           equal. Fortunately, the decision surfaces are usually not very sensitive to mild
           deviations from this condition; therefore, in normal practice, one uses an estimate
           of a pooled covariance matrix, computed as an average of the sample covariance
           matrices. This is the practice followed by SPSS and STATISTICA.
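
              As an illustration (not taken from the SPSS or STATISTICA output), the
           following R sketch computes a pooled covariance estimate from two simulated
           classes and the direction Σ⁻¹(m₁ − m₂), which is orthogonal to the separating
           hyperplane. The data, sample sizes and the weighting by degrees of freedom are
           assumptions made only for this sketch.

             # Minimal R sketch with simulated (hypothetical) data.
             set.seed(1)
             x1 <- matrix(rnorm(50 * 2, mean = 0), ncol = 2)  # 50 cases of class omega1, 2 features
             x2 <- matrix(rnorm(50 * 2, mean = 1), ncol = 2)  # 50 cases of class omega2
             n1 <- nrow(x1); n2 <- nrow(x2)
             S1 <- cov(x1); S2 <- cov(x2)                     # sample covariance matrices
             # Pooled estimate: average of the sample covariances, weighted by degrees of freedom
             Sp <- ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)
             m1 <- colMeans(x1); m2 <- colMeans(x2)
             w  <- solve(Sp, m1 - m2)                         # Sigma^{-1} (m1 - m2)
             w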


           Example 6.3
           Q: Redo Example 6.1, using a minimum Mahalanobis distance classifier. Check
           the computation of the discriminant parameters and determine to which class a
           cork with 65 defects is assigned.

           A: Given the similarity of both distributions, the Mahalanobis classifier produces
           the same classification results as the Euclidean classifier. Table 6.1 shows the
           classification matrix (obtained with SPSS) with the predicted classifications along
           the columns and the true (observed) classifications along the rows. We see that for
           this simple classifier, the overall percentage of correct classification in the data
           sample (training set) is 77%, or equivalently, the overall training set error is 23%
           (18% for ω₁ and 28% for ω₂). For the moment, we will not assess how the
           classifier performs with independent cases, i.e., we will not assess its test set error.
              The decision function coefficients (also known as Fisher's coefficients), as
           computed by SPSS, are shown in Table 6.2.


           Table 6.1. Classification matrix obtained with SPSS of two classes of cork
           stoppers using only one feature, N.

                                          Predicted Group Membership
                                Class          1            2          Total
           Original   Count       1           41            9           50
           Group                  2           14           36           50
                      %           1          82.0         18.0         100
                                  2          28.0         72.0         100
           77.0% of original grouped cases correctly classified.
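
           The per-class and overall training set errors quoted above can be recomputed
           directly from the counts in Table 6.1; the short R sketch below (not SPSS output)
           does just that.

             confusion <- matrix(c(41,  9,
                                   14, 36), nrow = 2, byrow = TRUE,
                                 dimnames = list(true = c("1", "2"),
                                                 predicted = c("1", "2")))
             1 - diag(confusion) / rowSums(confusion)    # per-class errors: 0.18 and 0.28
             1 - sum(diag(confusion)) / sum(confusion)   # overall training set error: 0.23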


           Table 6.2. Decision function coefficients obtained with SPSS for two classes of
           cork stoppers and one feature, N.

                                   Class 1        Class 2
           N                        0.192          0.277
           (Constant)              −6.005        −11.746
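
           Using the coefficients in Table 6.2, the class assignment for a cork with 65 defects
           can be checked by evaluating the two linear decision functions and picking the
           larger score; the small R sketch below illustrates the computation (the variable
           names are arbitrary).

             N  <- 65
             g1 <- 0.192 * N - 6.005     # score for class omega1:  6.475
             g2 <- 0.277 * N - 11.746    # score for class omega2:  6.259
             if (g1 > g2) "omega 1" else "omega 2"   # g1 > g2, so the cork is assigned to omega1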