Page 237 - Introduction to Statistical Pattern Recognition
P. 237

5  Parameter Estimation                                       219


                   5.3  Holdout, Leave-One-Out, and Resubstitution Methods

                   Bounds of the Bayes Error

                        When  a  finite  number  of  samples  is  given  and  the  performance  of  a
                   specified  classifier  is  to  be  estimated,  we  need  to  decide  which  samples  are
                   used  for  designing  the  classifier  and  which  samples  are  for  testing  the
                   classifier.

                        Upper  and  lower  bounds  of  the  Bayes  error:  In  general,  the
                   classification  error is  a  function  of  two  sets of  data, the  design  and  test  sets,
                   and may be expressed by



                   where, F  is a set of two densities as
                                          lF = (P I(X), P2(X)I  .             (5.1 10)
                   If  the classifier  is the  Bayes  for the  given  test  distributions, the resulting  error
                   is minimum.  Therefore, we have the following inequality
                                          E(PT,P;7) I E(P'D,PT)  .            (5.1 11)

                   The Bayes  error for the  true  I"  is E(P,P). However, we never know  the true
                                                      ,.
                   5). One  way  to overcome this  difficulty  is to find  upper and lower bounds of
                                            A   *
                   E(P,P) based on its estimate  @'  = (pI(X),p2(X)}. In order to accomplish this,
                   let us introduce from (5.1 11) two inequalities  as
                                           E(P,F) I     P) ,                  (5.1 12)
                                                    E(;,
                                                                              (5.1 13)

                                                                       1
                   Equation  (5.1 12) indicates  that  F  is the better design set than  0'  for testing  F.
                   Likewise, ; better  design  set than  P for testing ;.   Also, it  is  known
                              is
                                the
                   from (5.48) that, if an error counting procedure is adopted, the error estimate is
                   unbiased  with respect to test samples.  Therefore, the right-hand  side of (5.112)
                   can be modified to
                                          1            1..
                                        E( 5',  F) = E & ( E( 5',  PT) }  ,   (5.1 14)
                                                                    ,.
                         1
                   where PT is another set generated  from D  independently of  P. Also, after tak-
                   ing the expectation of (5.1 13),  the right-hand  side may be replaced  by
   232   233   234   235   236   237   238   239   240   241   242