Page 375 - Introduction to Statistical Pattern Recognition

7  Nonparametric Classification and Error Estimation         357


                     functions  using  design  samples  (design phase),  and  count  the  number  of
                     misclassified test samples (test phase).  As discussed in Chapter 5, with a finite
                     number of samples, the estimated error is biased due to the design samples and
                     has a variance due to the test samples.  In the grouped estimate, estimating $r$
                     corresponds to the design phase and computing the sample mean of $\hat{r}(X_i)$
                     corresponds to  the  test  phase.  The  performance  of  the  grouped estimate  has
                     not been fully studied in comparison with the error-counting result.  However,
                     if the risk function $r(X)$ is given, the test phase of the grouped estimate has the
                     following advantages.

                          (1) We can use test samples without knowing their true class assignments.
                     This property could be useful when a classifier is tested in the field.

                          (2) The variance due to this test process is less than half of the
                     variance due to error-counting.
                          In order to prove that the second property is true, let us compute the
                     variance of (7.82) with the assumption that $r(X)$ is given and $E\{r(X)\} = \varepsilon$.
                     Since the $X_i$'s are independent and identically distributed,


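$$
\begin{aligned}
\mathrm{Var}\{\hat{\varepsilon}\} &= \frac{1}{N}\,\mathrm{Var}\{r(X)\} \\
&= \frac{1}{N}\bigl[\varepsilon - \varepsilon^2 - E\{r(X)[1 - r(X)]\}\bigr] \\
&\le \frac{1}{N}\Bigl[\varepsilon - \varepsilon^2 - \frac{1}{2}E\{r(X)\}\Bigr] \\
&= \frac{1}{2N}(\varepsilon - 2\varepsilon^2),
\end{aligned}
\tag{7.83}
$$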

                     where $r(1-r) \ge r/2$ for $0 \le r \le 0.5$ is used to obtain the inequality.  Note from
                     (5.49) that the variance of error-counting is $\varepsilon(1-\varepsilon)/N$, which is larger than
                     twice (7.83); the difference from twice the bound is $\varepsilon^2/N$.  When the design
                     phase is included, we must add the bias and variance of $\hat{r}$ to evaluate the total
                     performance of the grouped estimate.  Also, note that the bias of $\hat{r}$ should be
                     removed by the threshold adjustment, as discussed in the previous section.  That is,
                     $r(X)$ must be estimated by solving (7.80) or (7.81) for $r(X)$ with the adjusted
                     threshold term.
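The variance comparison above can be checked numerically. The following is a minimal Monte Carlo sketch, not taken from the text. It assumes two univariate normal classes, N(0,1) and N(d,1), with equal priors, so that the risk function r(X) is known exactly (the "r(X) is given" case). The function names `posterior_class1`, `risk`, `one_trial`, and `compare_variances` are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def posterior_class1(x, d):
    """P(class 1 | x) for N(0,1) vs N(d,1) with equal priors (assumed setup)."""
    return 1.0 / (1.0 + math.exp(d * x - 0.5 * d * d))


def risk(x, d):
    """Risk function r(x) = min{q1(x), q2(x)}, so 0 <= r(x) <= 0.5."""
    q1 = posterior_class1(x, d)
    return min(q1, 1.0 - q1)


def one_trial(rng, n, d):
    """Draw n test samples; return (grouped estimate, error-counting estimate)."""
    grouped_sum = 0.0
    errors = 0
    for _ in range(n):
        label = 1 if rng.random() < 0.5 else 2
        x = rng.gauss(0.0 if label == 1 else d, 1.0)
        # Grouped estimate: sample mean of r(x); the true label is not used.
        grouped_sum += risk(x, d)
        # Error counting: the Bayes decision is compared with the true label.
        decided = 1 if posterior_class1(x, d) >= 0.5 else 2
        errors += int(decided != label)
    return grouped_sum / n, errors / n


def compare_variances(n=100, d=2.0, trials=2000, seed=0):
    """Estimate the mean and variance of both estimates over repeated trials."""
    rng = random.Random(seed)
    results = [one_trial(rng, n, d) for _ in range(trials)]
    grouped = [g for g, _ in results]
    counted = [c for _, c in results]
    # Bayes error for this setup: Phi(-d/2) = 0.5 * erfc(d / (2 * sqrt(2))).
    bayes_error = 0.5 * math.erfc(d / (2.0 * math.sqrt(2.0)))
    return {
        "bayes_error": bayes_error,
        "grouped_mean": statistics.fmean(grouped),
        "counted_mean": statistics.fmean(counted),
        "grouped_var": statistics.variance(grouped),
        "counted_var": statistics.variance(counted),
        # Right-hand side of (7.83) and the error-counting variance from (5.49).
        "bound_7_83": (bayes_error - 2.0 * bayes_error ** 2) / (2.0 * n),
        "counting_theory": bayes_error * (1.0 - bayes_error) / n,
    }


if __name__ == "__main__":
    for name, value in compare_variances().items():
        print(f"{name:>16s}: {value:.6f}")
```

Under these assumptions, both estimates should center on the Bayes error, while the sample variance of the grouped estimate should fall below the bound of (7.83) and below half of the error-counting variance. Only the test phase is simulated here; the bias and variance contributed by estimating r(X) in the design phase are not included.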