

BAYES DECISION RULE:

Given a feature vector x, assign it to class $\omega_j$ if

$$P(\omega_j \mid x) > P(\omega_i \mid x); \qquad i = 1, \ldots, J; \; i \neq j. \tag{9.6}$$
                             This states that we will classify an observation x as belonging to the class that
                             has the highest posterior probability. It is known [Duda and Hart, 1973] that
                             the decision rule given by Equation 9.6 yields a classifier with the minimum
                             probability of error.
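As a rough sketch (our own, not from the text), the MATLAB fragment below applies Equation 9.6 when the posterior probabilities for a single observation have already been computed; the variable posteriors and its values are hypothetical.

    % Hypothetical posterior probabilities P(omega_j | x) for J = 3 classes,
    % evaluated at one observation x (values chosen only for illustration).
    posteriors = [0.2 0.7 0.1];
    % Bayes Decision Rule (Equation 9.6): pick the class with the largest
    % posterior probability.
    [maxpost, jhat] = max(posteriors);
    fprintf('Assign x to class %d\n', jhat);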
                              We can use an equivalent rule by recognizing that the denominator of the
                             posterior probability (see Equation 9.2) is simply a normalization factor and
                             is the same for all classes. So, we can use the following alternative decision
                             rule:

$$P(x \mid \omega_j)P(\omega_j) > P(x \mid \omega_i)P(\omega_i); \qquad i = 1, \ldots, J; \; i \neq j. \tag{9.7}$$
                             Equation 9.7 is Bayes Decision Rule in terms of the class-conditional and
                             prior probabilities. If we have equal priors for each class, then our decision is
                             based only on the class-conditional probabilities. In this case, the decision
rule partitions the feature space into J decision regions $\Omega_1, \Omega_2, \ldots, \Omega_J$. If x is in region $\Omega_j$, then we will say it belongs to class $\omega_j$.
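To make Equation 9.7 concrete, here is a small sketch for a hypothetical two-class univariate problem; the means, standard deviations, priors, and test point x0 are assumed for illustration, and normpdf requires the Statistics Toolbox.

    % Hypothetical two-class problem with normal class-conditionals
    % (parameters chosen only for illustration).
    mu1 = 0;  sig1 = 1;        % class omega_1
    mu2 = 2;  sig2 = 1;        % class omega_2
    p1 = 0.5; p2 = 0.5;        % prior probabilities
    x0 = 0.8;                  % observation to classify
    % Evaluate P(x | omega_j)*P(omega_j) for each class.
    g1 = normpdf(x0, mu1, sig1)*p1;
    g2 = normpdf(x0, mu2, sig2)*p2;
    % Bayes Decision Rule in the form of Equation 9.7.
    if g1 > g2
        disp('Assign x0 to class omega_1')
    else
        disp('Assign x0 to class omega_2')
    end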
                              We now turn our attention to the error we have in our classifier when we
                             use Bayes Decision Rule. An error is made when we classify an observation
as class $\omega_i$ when it is really in the j-th class. We denote the complement of region $\Omega_i$ as $\Omega_i^c$, which represents every region except $\Omega_i$. To get the probability of error, we calculate the following integral over all values of x [Duda and Hart, 1973; Webb, 1999]
$$P(\text{error}) = \sum_{i=1}^{J} \int_{\Omega_i^c} P(x \mid \omega_i)P(\omega_i)\, dx. \tag{9.8}$$
                             Thus, to find the probability of making an error (i.e., assigning the wrong
                             class to an observation), we find the probability of error for each class and
                             add the probabilities together. In the following example, we make this clearer
by looking at a two-class case and calculating the probability of error.
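Before turning to Example 9.3, it may help to see Equation 9.8 evaluated numerically. The sketch below reuses the hypothetical two-class setup from the previous fragment (all parameters are assumed, not taken from the example) and approximates each integral with trapz.

    % Hypothetical two-class setup; with equal priors and equal variances
    % the decision boundary xb lies halfway between the means.
    mu1 = 0;  sig1 = 1;  p1 = 0.5;
    mu2 = 2;  sig2 = 1;  p2 = 0.5;
    xb = (mu1 + mu2)/2;
    % Equation 9.8: integrate each weighted class-conditional over the
    % region where that class is NOT chosen.
    % Here Omega_1^c is x > xb and Omega_2^c is x < xb.
    x1 = linspace(xb, mu1 + 6*sig1, 1000);   % approximates (xb, infinity)
    x2 = linspace(mu2 - 6*sig2, xb, 1000);   % approximates (-infinity, xb)
    perr = trapz(x1, normpdf(x1, mu1, sig1)*p1) + ...
           trapz(x2, normpdf(x2, mu2, sig2)*p2);
    fprintf('Estimated probability of error: %.4f\n', perr)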

                             Example 9.3
                             We will look at a univariate classification problem with equal priors and two
                             classes. The class-conditionals are given by the normal distributions as fol-
                             lows:




