Page 258 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

6.3 Bayesian Classification


Let us assume, first, that wrong decisions imply the same loss, which can be scaled to a unitary loss:

$$\lambda_{ij} = \lambda(\alpha_i \mid \omega_j) = \begin{cases} 0 & \text{if } i = j \\ 1 & \text{if } i \neq j \end{cases} \qquad \text{6.22a}$$

In this situation, since all posterior probabilities add up to one, we have to minimise:

$$R(\alpha_i \mid \mathbf{x}) = \sum_{j \neq i} P(\omega_j \mid \mathbf{x}) = 1 - P(\omega_i \mid \mathbf{x}). \qquad \text{6.22b}$$

This corresponds to maximising $P(\omega_i \mid \mathbf{x})$, i.e., the Bayes decision rule for minimum risk corresponds to the generalised version of 6.15a:

$$\text{Decide } \omega_i \ \text{ if } \ P(\omega_i \mid \mathbf{x}) > P(\omega_j \mid \mathbf{x}), \ \forall j \neq i. \qquad \text{6.22c}$$

Thus, the decision function for class $\omega_i$ is the posterior probability, $g_i(\mathbf{x}) = P(\omega_i \mid \mathbf{x})$, and the classification rule amounts to selecting the class with maximum posterior probability.
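As a minimal sketch (not from the book), the minimum-risk rule 6.22c with unitary losses reduces to an argmax over posteriors; the likelihood and prior values in the call below are illustrative assumptions, not data from the text:

```python
def bayes_decide(likelihoods, priors):
    # P(w_i | x) is proportional to p(x | w_i) * P(w_i);
    # dividing by their sum (the evidence p(x)) gives the posteriors.
    joint = [l * p for l, p in zip(likelihoods, priors)]
    total = sum(joint)
    posteriors = [j / total for j in joint]
    # Rule 6.22c: decide the class with maximum posterior probability.
    decision = max(range(len(posteriors)), key=lambda i: posteriors[i])
    return decision, posteriors

# Illustrative two-class call: equal priors, x more likely under class 0.
decision, posteriors = bayes_decide([0.30, 0.10], [0.5, 0.5])
print(decision, posteriors)  # class 0 wins with posterior 0.75
```

Note that the normalisation by $p(\mathbf{x})$ does not change the argmax; it is kept only so the returned values are true posteriors.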
Let us now consider the situation of different losses for wrong decisions, assuming, for the sake of simplicity, that $c = 2$. Taking into account expressions 6.20a and 6.20b, it is readily concluded that we will decide $\omega_1$ if:

$$\lambda_{21} P(\omega_1 \mid \mathbf{x}) > \lambda_{12} P(\omega_2 \mid \mathbf{x}) \;\Rightarrow\; p(\mathbf{x} \mid \omega_1)\,\lambda_{21} P(\omega_1) > p(\mathbf{x} \mid \omega_2)\,\lambda_{12} P(\omega_2). \qquad \text{6.23}$$

              This is equivalent to formula 6.17 using the following adjusted prevalences:

$$P^*(\omega_1) = \frac{\lambda_{21} P(\omega_1)}{\lambda_{21} P(\omega_1) + \lambda_{12} P(\omega_2)}; \quad P^*(\omega_2) = \frac{\lambda_{12} P(\omega_2)}{\lambda_{21} P(\omega_1) + \lambda_{12} P(\omega_2)}. \qquad \text{6.23a}$$
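The two-class adjustment 6.23a can be sketched as follows. The prevalence values $P(\omega_1) = 0.4$ and $P(\omega_2) = 0.6$ used in the call are an assumption (chosen to be consistent with the adjusted values quoted in Example 6.7), since the prevalences of 6.14 are not reproduced on this page:

```python
def adjusted_prevalences(p1, p2, lam21, lam12):
    # Formula 6.23a: weight each prevalence by the loss incurred when
    # that class is misclassified, then renormalise so they sum to one.
    denom = lam21 * p1 + lam12 * p2
    return lam21 * p1 / denom, lam12 * p2 / denom

# Assumed prevalences (see lead-in) with the losses of Example 6.7.
p1_adj, p2_adj = adjusted_prevalences(0.4, 0.6, lam21=0.01, lam12=0.015)
print(round(p1_adj, 3), round(p2_adj, 3))  # 0.308 0.692
```

The adjusted values can then be plugged into rule 6.17 in place of the original prevalences.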

STATISTICA and SPSS allow specifying the priors either as estimates of the sample composition (as in 6.14) or as user-assigned specific values. In the latter case, the user can adjust the priors in order to cope with specific classification risks.

           Example 6.7
           Q: Redo Example 6.6 using adjusted prevalences that take into account 6.14 and
           the loss matrix 6.19. Compare the classification risks with and without prevalence
           adjustment.
A: The losses are $\lambda_{12} = 0.015$ and $\lambda_{21} = 0.01$. Using the prevalences of 6.14, one obtains $P^*(\omega_1) = 0.308$ and $P^*(\omega_2) = 0.692$. The higher loss associated with a wrong classification of an $\omega_2$ cork stopper leads to an increase of $P^*(\omega_2)$ compared with $P^*(\omega_1)$. The consequence of this adjustment is the decrease of the number of