Page 254 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

6.3 Bayesian Classification


Note that the prevalences are not entirely controlled by the factory; they depend mainly on the quality of the raw material. Likewise, a cardiologist cannot control how prevalent myocardial infarction is in a given population. Prevalences can, therefore, be regarded as “states of nature”.
Suppose we are asked to make a blind decision as to which class a cork stopper belongs to, without looking at it. If the only available information is the prevalences, the sensible choice is class $\omega_2$. This way, we expect to be wrong only 40% of the time.
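The blind decision can be sketched in a few lines of Python. The prevalence values used here are an assumption inferred from the 40% error figure quoted above for two classes (0.4 for class 1 and 0.6 for class 2); the function name and the class labels are hypothetical.

```python
def blind_decision(prevalences):
    """Pick the class with the largest prevalence.

    Returns the chosen class label and the expected error rate,
    which is one minus the prevalence of the chosen class.
    """
    best = max(prevalences, key=prevalences.get)
    return best, 1.0 - prevalences[best]


# Assumed prevalences, inferred from the 40% error quoted in the text.
prevalences = {"w1": 0.4, "w2": 0.6}
chosen, error = blind_decision(prevalences)
print(chosen, round(error, 2))
```

With these assumed values the rule picks class 2 and is expected to be wrong whenever a class 1 stopper is presented.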
Assume now that we are allowed to measure the feature vector $\mathbf{x}$ of the presented cork stopper. Let $P(\omega_i \mid \mathbf{x})$ be the conditional probability of the cork stopper represented by $\mathbf{x}$ belonging to class $\omega_i$. If we are able to determine the probabilities $P(\omega_1 \mid \mathbf{x})$ and $P(\omega_2 \mid \mathbf{x})$, the sensible decision is now:

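$$
\begin{aligned}
&\text{If } P(\omega_1 \mid \mathbf{x}) > P(\omega_2 \mid \mathbf{x}) \text{ we decide } \mathbf{x} \in \omega_1;\\
&\text{If } P(\omega_1 \mid \mathbf{x}) < P(\omega_2 \mid \mathbf{x}) \text{ we decide } \mathbf{x} \in \omega_2;\\
&\text{If } P(\omega_1 \mid \mathbf{x}) = P(\omega_2 \mid \mathbf{x}) \text{ the decision is arbitrary.}
\end{aligned}
\tag{6.15}
$$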

              We can condense 6.15 as:

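$$
\text{If } P(\omega_1 \mid \mathbf{x}) > P(\omega_2 \mid \mathbf{x}) \text{ then } \mathbf{x} \in \omega_1 \text{ else } \mathbf{x} \in \omega_2. \tag{6.15a}
$$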

The posterior probabilities $P(\omega_i \mid \mathbf{x})$ can be computed if we know the pdfs of the distributions of the feature vectors in both classes, $p(\mathbf{x} \mid \omega_i)$, the so-called likelihood of $\mathbf{x}$. As a matter of fact, Bayes' law (see Appendix A) states that:

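$$
P(\omega_i \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid \omega_i)\, P(\omega_i)}{p(\mathbf{x})}, \tag{6.16}
$$

with $p(\mathbf{x}) = \sum_{i=1}^{c} p(\mathbf{x} \mid \omega_i)\, P(\omega_i)$, the total probability of $\mathbf{x}$.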
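As a minimal sketch of how Bayes' law turns likelihoods and prevalences into posterior probabilities, the Python function below multiplies each likelihood by its prevalence and normalises by the total probability. The function name is hypothetical, and the likelihood and prevalence values are illustrative placeholders, not estimates from the cork stopper data.

```python
def posteriors(likelihoods, priors):
    """Apply Bayes' law: P(w_i | x) = p(x | w_i) P(w_i) / p(x)."""
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    total = sum(joint)  # p(x), the total probability of x
    if total == 0.0:
        raise ValueError("p(x) is zero; the posteriors are undefined")
    return [j / total for j in joint]


# Hypothetical values of p(x | w_i) and P(w_i), for illustration only.
likelihoods = [0.02, 0.01]
priors = [0.4, 0.6]
post = posteriors(likelihoods, priors)
print(post)
```

Because every term is divided by the same total, the posteriors sum to one regardless of the scale of the likelihood values.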
Note that $P(\omega_i)$ and $P(\omega_i \mid \mathbf{x})$ are discrete probabilities (symbolised by a capital letter), whereas $p(\mathbf{x} \mid \omega_i)$ and $p(\mathbf{x})$ are values of probability density functions. Note also that $p(\mathbf{x})$ is a common term in the comparison expressed by 6.15a; therefore, for two classes, we may rewrite the rule as:

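$$
\text{If } p(\mathbf{x} \mid \omega_1)\, P(\omega_1) > p(\mathbf{x} \mid \omega_2)\, P(\omega_2) \text{ then } \mathbf{x} \in \omega_1 \text{ else } \mathbf{x} \in \omega_2. \tag{6.17}
$$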
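The two-class rule 6.17 can be sketched as follows, under the assumption that each class-conditional pdf is a univariate normal. The function names, the (mean, standard deviation) pairs and the prevalences are hypothetical placeholders chosen for illustration; they are not estimates from the cork stopper data.

```python
import math


def normal_pdf(x, mean, std):
    """Value of the normal density with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def classify(x, params, priors):
    """Rule 6.17: compare p(x|w1)P(w1) with p(x|w2)P(w2).

    The total probability p(x) is never computed, since it is common
    to both sides. Ties fall to class 2, as in the 'else' branch.
    """
    score1 = normal_pdf(x, *params[0]) * priors[0]
    score2 = normal_pdf(x, *params[1]) * priors[1]
    return 1 if score1 > score2 else 2


# Hypothetical (mean, std) per class, placeholders only.
params = [(50.0, 10.0), (80.0, 10.0)]
priors = [0.4, 0.6]
print(classify(55.0, params, priors), classify(75.0, params, priors))
```

Dividing both scores by p(x) would give the posteriors of 6.16, so this comparison yields the same decision as 6.15a.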

           Example 6.5
Q: Consider the classification of cork stoppers based on the number of defects, N, and restricted to the first two classes, “Super” and “Average”. Estimate the posterior probabilities and the classification of a cork stopper with 65 defects, using prevalences 6.14.
           A: The feature vector is x = [N], and we seek the classification of x = [65]. Figure
           6.8 shows the histograms of both classes with a superimposed normal curve.