Page 255 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

236      6 Statistical Classification
























           Figure 6.8. Histograms of feature N for two classes of cork stoppers, obtained with
           STATISTICA. The threshold value N = 65 is marked with a vertical line.


              From this graphic display³ we can estimate the likelihoods and the posterior
           probabilities (the latter up to a common normalizing factor):

              \[ p(x \mid \omega_1) = 20/24 = 0.833 \;\Rightarrow\; P(\omega_1)\, p(x \mid \omega_1) = 0.4 \times 0.833 = 0.333; \tag{6.18a} \]
              \[ p(x \mid \omega_2) = 16/23 = 0.696 \;\Rightarrow\; P(\omega_2)\, p(x \mid \omega_2) = 0.6 \times 0.696 = 0.418. \tag{6.18b} \]

              We then decide class ω₂, although the likelihood of ω₁ is bigger than that of ω₂.
           Notice how the prevalences of the statistical model changed the conclusion reached by
           the minimum distance classification (see Example 6.3).
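
              The arithmetic of 6.18 can be re-traced with a minimal Python sketch. It takes the
           ratios 20/24 and 16/23 and the prevalences 0.4 and 0.6 at face value from the text; the
           names likelihoods, priors, scores and posteriors are illustrative only, and the last
           step (normalizing the two products so that they sum to one) is an addition not carried
           out in the text.

```python
# Likelihood estimates read off the histograms (values used in 6.18).
likelihoods = {"w1": 20 / 24, "w2": 16 / 23}

# Prevalences (prior probabilities) used in 6.18.
priors = {"w1": 0.4, "w2": 0.6}

# Prevalence-weighted likelihoods, P(w) * p(x|w).
scores = {c: priors[c] * likelihoods[c] for c in priors}

# Bayes decision: pick the class with the largest weighted likelihood.
decision = max(scores, key=scores.get)

# Optional step (not in the text): normalize to proper posteriors.
total = sum(scores.values())
posteriors = {c: s / total for c, s in scores.items()}

for c in scores:
    print(f"{c}: score = {scores[c]:.3f}, posterior = {posteriors[c]:.3f}")
print("decision:", decision)
```

           Normalizing does not change the decision, since both products are divided by the same
           factor; it only turns them into probabilities that sum to one.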


              Figure 6.9 illustrates the effect of adjusting the prevalence threshold assuming
           equal and normal pdfs:

              •   Equal prevalences. With equal pdfs, the decision threshold lies halfway
                 between the means. The number of cases incorrectly classified,
                 proportional to the shaded areas, is equal for both classes. This situation is
                 identical to the minimum distance classifier.
              •   Prevalence of ω₁ bigger than that of ω₂. The decision threshold is displaced
                 towards the class with smaller prevalence, therefore decreasing the number
                 of wrongly classified cases of the class with higher prevalence, as seems
                 appropriate.
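
              The two situations above can be sketched numerically. The following Python fragment
           assumes two normal pdfs with a common standard deviation sigma and means mu1 < mu2;
           equating P(ω₁) p(x|ω₁) and P(ω₂) p(x|ω₂) and taking logarithms then gives the threshold
           (mu1 + mu2)/2 + sigma² ln(P(ω₂)/P(ω₁)) / (mu1 − mu2). The function names and the numeric
           values are hypothetical and are not taken from Figure 6.9.

```python
import math


def bayes_threshold(mu1, mu2, sigma, p1):
    """Decision threshold for two equal-variance normal classes.

    p1 is the prevalence of class 1; class 2 has prevalence 1 - p1.
    """
    p2 = 1.0 - p1
    midpoint = 0.5 * (mu1 + mu2)
    # The shift vanishes for equal prevalences (log(1) == 0).
    shift = sigma ** 2 / (mu1 - mu2) * math.log(p2 / p1)
    return midpoint + shift


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def error_rates(mu1, mu2, sigma, threshold):
    """Per-class misclassification probabilities, assuming mu1 < mu2."""
    err1 = 1.0 - normal_cdf((threshold - mu1) / sigma)  # class 1 cases above the threshold
    err2 = normal_cdf((threshold - mu2) / sigma)        # class 2 cases below the threshold
    return err1, err2


# Hypothetical values, chosen only for illustration.
mu1, mu2, sigma = 0.0, 2.0, 1.0
for p1 in (0.5, 0.7):
    t = bayes_threshold(mu1, mu2, sigma, p1)
    e1, e2 = error_rates(mu1, mu2, sigma, t)
    print(f"P(w1) = {p1}: threshold = {t:.3f}, err1 = {e1:.3f}, err2 = {e2:.3f}")
```

           With equal prevalences the threshold falls at the midpoint and both error rates
           coincide; raising the prevalence of class 1 moves the threshold towards mu2 and lowers
           the error rate of class 1 at the expense of class 2.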

           ³ The normal curve fitted by STATISTICA is multiplied by the factor “number of cases” ×
             “histogram interval width”, which is 1000 in the present case. This constant factor is of no
             importance and is neglected in the computations of 6.18.