Page 339 - Introduction to Statistical Pattern Recognition

7  Nonparametric Classification and Error Estimation
                     (2)  Plot these $\ell$ empirical points $\hat{\varepsilon}_{NN}$ vs. $\beta_1$. Then, find the line best fitted to
                          these $\ell$ points.  The slope of this line is $E_X\{\cdot\}$ and the y-intercept is $\varepsilon^*_{NN}$,
                          which we would like to estimate.
                     The reader must be aware that $\hat{\varepsilon}_{NN}$ varies widely (the standard deviation of $\hat{\varepsilon}_{NN}$
                     for each $\beta_1$ is large), and that $\beta_1 = 0$ is far away from the $\beta_1$-region where the
                     actual experiments are conducted.  Therefore, a small variation in $\hat{\varepsilon}_{NN}$ tends to
                     be amplified and causes a large shift in the estimate of $\varepsilon^*_{NN}$.
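The line fit of step (2) and the caution about extrapolating to $\beta_1 = 0$ can be illustrated with a minimal sketch. Python with NumPy is an assumption here, and so are the function name `fit_error_line` and every numerical value (the "true" slope and intercept, the $\beta_1$ values, the noise level); they are synthetic stand-ins, not results from the text.

```python
import numpy as np

def fit_error_line(beta1, eps_hat):
    """Least-squares line eps_hat ~ slope * beta1 + intercept.

    The intercept plays the role of the asymptotic NN error estimate,
    the slope the role of the expectation term E_X{.}.
    """
    slope, intercept = np.polyfit(beta1, eps_hat, deg=1)
    return slope, intercept

# Hypothetical setup (illustrative values only, not taken from the text).
TRUE_SLOPE, TRUE_INTERCEPT = 0.5, 0.10
NOISE_STD = 0.01                      # scatter of each measured error
beta1 = np.array([0.20, 0.25, 0.30, 0.35, 0.40])  # far from beta1 = 0

# Repeat the "experiment" many times to see how the intercept scatters.
rng = np.random.default_rng(0)
intercepts = np.empty(2000)
for i in range(intercepts.size):
    noise = rng.normal(0.0, NOISE_STD, beta1.size)
    eps_hat = TRUE_INTERCEPT + TRUE_SLOPE * beta1 + noise
    _, intercepts[i] = fit_error_line(beta1, eps_hat)

print("mean of intercept estimates:", intercepts.mean())
print("std of intercept estimates: ", intercepts.std())
print("std of a single measurement:", NOISE_STD)
```

With these made-up numbers the scatter of the fitted intercept comes out larger than the scatter of any single measurement, because the line is extrapolated to $\beta_1 = 0$, well outside the range of the data.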



                     Biases for Other Cases

                          2NN: The bias of the 2NN error can be obtained from (7.13) in a similar
                     fashion, resulting in [8]

                         $$E\{\hat{\varepsilon}_{2NN}\} \cong \varepsilon^*_{2NN} + \beta_2\, E_X\{[\,|A|^{-1/n}\,\mathrm{tr}\{A B_2(X)\}]^2\} \tag{7.38}$$
                     where
                         $$B_2(X) = p^{-2/n}(X)\left[\nabla p(X)\,\nabla^T q_1(X)\,p^{-1}(X) + \frac{1}{2}\nabla^2 q_1(X)\right] \tag{7.39}$$









                         $$\beta_2 = \cdots\;\frac{\Gamma(1+4/n)}{1+2/n}\,N^{-4/n}. \tag{7.40}$$


                     By comparing (7.40) with (7.37), it can be seen that $\beta_2$ is roughly proportional
                     to $N^{-4/n}$ while $\beta_1$ is proportional to $N^{-2/n}$.  That is, as $N$ increases, the 2NN
                     error converges to its asymptotic value more quickly than the NN error, as if
                     the dimensionality, $n$, were half as large.  Also, note that $\beta_2$ is significantly
                     smaller than $\beta_1$, because $\Gamma^{2/n}/(n\pi)$ (.088 for $n = 8$ and .068 for $n = 32$) is
                     squared.  Many experiments also have revealed that the 2NN error is less
                     biased than the NN error [11].  Since their asymptotic errors are related by
                     $\varepsilon^*_{NN} = 2\varepsilon^*_{2NN}$ from (7.11) and (7.14), a better estimate of $\varepsilon^*_{NN}$ could be obtained
                     by estimating $\varepsilon^*_{2NN}$ first and doubling it.
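The two numerical claims in this comparison can be checked with a short sketch. It assumes that the abbreviated factor $\Gamma^{2/n}/(n\pi)$ stands for $\Gamma^{2/n}(n/2+1)/(n\pi)$; the argument $n/2+1$ is not given on this page and is an assumption, adopted because it reproduces the quoted values. The function names are hypothetical, and only the $N$-dependence of $\beta_1$ and $\beta_2$ is modeled, not their full expressions.

```python
import math

def gamma_coefficient(n):
    """Gamma^{2/n}(n/2 + 1) / (n * pi).

    The argument n/2 + 1 is an assumption; it is not stated on this page.
    """
    return math.gamma(n / 2 + 1) ** (2 / n) / (n * math.pi)

def nn_rate(N, n):
    """N-dependence of the NN bias coefficient: N^(-2/n)."""
    return N ** (-2 / n)

def two_nn_rate(N, n):
    """N-dependence of the 2NN bias coefficient: N^(-4/n)."""
    return N ** (-4 / n)

# The factor and its square for the two dimensionalities quoted in the text.
for n in (8, 32):
    c = gamma_coefficient(n)
    print(f"n = {n:2d}: coefficient = {c:.3f}, squared = {c * c:.4f}")

# The 2NN rate in n dimensions equals the NN rate in n/2 dimensions.
N, n = 1000, 16
print("2NN rate, n = 16:", two_nn_rate(N, n))
print("NN rate,  n = 8: ", nn_rate(N, n // 2))
```

Squaring a factor below 0.1 shrinks it by more than an order of magnitude, which is the sense in which $\beta_2$ is much smaller than $\beta_1$, and $N^{-4/n} = N^{-2/(n/2)}$ is the "half the dimensionality" statement in the text.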