$$E_{\hat{P}_T}\{\varepsilon(\hat{P}, \hat{P}_T)\} = \varepsilon(\hat{P}, P). \tag{5.115}$$

Thus, combining (5.112)-(5.115),

$$E\{\varepsilon(\hat{P}, \hat{P})\} \leq \varepsilon(P, P) \leq E_{\hat{P}_T}\{\varepsilon(\hat{P}, \hat{P}_T)\}. \tag{5.116}$$

That is, the Bayes error, $\varepsilon(P, P)$, is bounded by two sample-based estimates [13].
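Equations (5.112)-(5.114) lie on the preceding page. As a reading aid, the display below reconstructs how the two halves of (5.116) are assembled, under the assumption that those off-page relations take the standard plug-in-Bayes forms quoted in the comments; this sketch is mine, not the book's own derivation.

    % Assumed forms of the off-page relations:
    %   (5.112)  \varepsilon(\hat{P},\hat{P}) \le \varepsilon(P,\hat{P})
    %            (the classifier designed on \hat{P} is optimal for \hat{P})
    %   (5.113)  E\{\varepsilon(P,\hat{P})\} = \varepsilon(P,P)
    %            (testing the true Bayes classifier on samples drawn from P is unbiased)
    %   (5.114)  \varepsilon(\hat{P},P) \ge \varepsilon(P,P)
    %            (no classifier beats the Bayes classifier on P)
    \begin{align*}
    E\{\varepsilon(\hat{P},\hat{P})\}
      &\le E\{\varepsilon(P,\hat{P})\} = \varepsilon(P,P)
      && \text{by (5.112) and (5.113),}\\
    \varepsilon(P,P)
      &\le \varepsilon(\hat{P},P) = E_{\hat{P}_T}\{\varepsilon(\hat{P},\hat{P}_T)\}
      && \text{by (5.114) and (5.115).}
    \end{align*}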
The rightmost term is obtained by generating two independent sample sets, $\hat{P}$ and $\hat{P}_T$, from $P$, and using $\hat{P}$ for designing the Bayes classifier and $\hat{P}_T$ for testing. The expectation of this error with respect to $\hat{P}_T$ gives the upper bound of the Bayes error. Furthermore, taking the expectation of this result with respect to $\hat{P}$ does not change the inequality. Therefore, $E_{\hat{P}} E_{\hat{P}_T}\{\varepsilon(\hat{P}, \hat{P}_T)\}$ also can be used as the upper bound. This procedure is called the holdout (H) method. On the other hand, $E\{\varepsilon(\hat{P}, \hat{P})\}$ is obtained by using $\hat{P}$ for designing the Bayes classifier and the same $\hat{P}$ for testing. The expectation of this error with respect to $\hat{P}$ gives the lower bound of the Bayes error. This procedure is called the resubstitution (R) method.
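A minimal numerical sketch of the two estimates (my illustration, not from the text): two Gaussian classes with equal priors, a plug-in quadratic classifier standing in for the Bayes classifier designed on $\hat{P}$, and the R and H errors averaged over repeated trials. The specific distributions, sample sizes, and function names are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def design_quadratic(X0, X1):
        # Plug-in Bayes (quadratic) classifier: estimate each class's
        # mean and covariance from the design samples.
        params = []
        for X in (X0, X1):
            m = X.mean(axis=0)
            S = np.cov(X, rowvar=False)
            params.append((m, np.linalg.inv(S), np.log(np.linalg.det(S))))

        def classify(X):
            # Compare class-conditional log-likelihoods (equal priors assumed).
            g = []
            for m, Sinv, logdet in params:
                d = X - m
                g.append(-0.5 * np.einsum("ij,jk,ik->i", d, Sinv, d) - 0.5 * logdet)
            return (g[1] > g[0]).astype(int)

        return classify

    def error_rate(classify, X0, X1):
        # Average of the two class-conditional error rates (equal priors).
        return 0.5 * (np.mean(classify(X0) != 0) + np.mean(classify(X1) != 1))

    N, trials = 50, 200
    eps_R, eps_H = [], []
    for _ in range(trials):
        X0, X1 = rng.normal(0, 1, (N, 2)), rng.normal(1, 1, (N, 2))  # design set ("P-hat")
        T0, T1 = rng.normal(0, 1, (N, 2)), rng.normal(1, 1, (N, 2))  # independent test set ("P-hat_T")
        clf = design_quadratic(X0, X1)
        eps_R.append(error_rate(clf, X0, X1))  # R: test on the design set itself
        eps_H.append(error_rate(clf, T0, T1))  # H: test on an independent set
    print(f"mean eps_R = {np.mean(eps_R):.3f}  (lower bound on the Bayes error)")
    print(f"mean eps_H = {np.mean(eps_H):.3f}  (upper bound on the Bayes error)")

Averaging over trials, the R estimate should fall below the true Bayes error of this problem and the H estimate above it, in the sense of (5.116).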
The holdout method works well if the data sets are generated artificially by a computer. In practice, however, if only one set of data is available, then in order to apply the holdout method we need to divide the available sample set into two independent groups. This reduces the number of samples available for designing and testing. Also, how to divide the samples is a serious and non-trivial problem. We must implement a proper dividing algorithm to assure that the distributions of the design and test samples are very close. In addition, how to allocate samples between design and test is another problem. Since we know the effects of the design and test sample sizes from the discussion of Section 5.2, we may determine how samples should be allocated to design and test by balancing the resulting bias and variance. As seen in Section 5.2, the bias is determined by the number of design samples, while the variance is primarily determined by the number of test samples.
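A hedged simulation sketch of this trade-off (again my illustration): a one-dimensional two-Gaussian problem in which the designed classifier is a threshold at the midpoint of the two sample means. Shrinking the design set should push the mean of the holdout estimate above the Bayes error (bias), while shrinking the test set should inflate its standard deviation (variance). The distributions and sizes are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(1)

    def holdout_error(n_design, n_test, trials=300):
        # Two 1-D classes, N(0,1) vs. N(2,1); the designed classifier is a
        # threshold at the midpoint of the two sample means.
        errs = []
        for _ in range(trials):
            d0, d1 = rng.normal(0, 1, n_design), rng.normal(2, 1, n_design)
            t0, t1 = rng.normal(0, 1, n_test), rng.normal(2, 1, n_test)
            thr = 0.5 * (d0.mean() + d1.mean())                          # design step
            errs.append(0.5 * (np.mean(t0 > thr) + np.mean(t1 < thr)))   # test step
        return np.mean(errs), np.std(errs)

    # Bias: the mean estimate drifts above the Bayes error (about 0.159 here)
    # as the design set shrinks, with the test size held large.
    for nd in (5, 20, 100):
        m, s = holdout_error(nd, 1000)
        print(f"design={nd:4d}  test=1000  mean={m:.4f}  std={s:.4f}")

    # Variance: the spread grows as the test set shrinks, design size held fixed.
    for nt in (20, 100, 1000):
        m, s = holdout_error(100, nt)
        print(f"design= 100  test={nt:4d}  mean={m:.4f}  std={s:.4f}")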
A procedure, called the leave-one-out (L) method, alleviates the above difficulties of the H method [13]. In the L method, one sample is excluded, the classifier is designed on the remaining $N-1$ samples, and the excluded sample is tested by the classifier. This operation is repeated $N$ times to test all $N$ samples. Then, the number of misclassified samples is counted to obtain the estimate of the error.
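A minimal sketch of the L method (not the book's code), using a nearest-mean classifier as a stand-in for whatever classifier is being designed; the data, sizes, and names are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(2)

    def leave_one_out_error(X, y):
        # L method: exclude one sample, design on the remaining N-1 samples,
        # test the excluded sample; repeat for all N samples.
        N = len(y)
        miss = 0
        for i in range(N):
            keep = np.arange(N) != i                 # hold out sample i
            Xd, yd = X[keep], y[keep]
            m0 = Xd[yd == 0].mean(axis=0)            # class means estimated
            m1 = Xd[yd == 1].mean(axis=0)            # from the N-1 design samples
            pred = int(np.linalg.norm(X[i] - m1) < np.linalg.norm(X[i] - m0))
            miss += pred != y[i]                     # test the held-out sample
        return miss / N                              # fraction misclassified

    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(1.5, 1, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)
    print(f"leave-one-out error estimate: {leave_one_out_error(X, y):.3f}")

Every sample is tested by a classifier designed without it, so no data are sacrificed to a separate test set, which is exactly the difficulty of the H method that the L method is meant to avoid.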