$$
\begin{aligned}
r(X) &= \hat{q}_1(X)\, q_2(X) + \hat{q}_2(X)\, q_1(X) \\
     &= \frac{q_1^2(X)\, q_2(X) + q_2^2(X)\, q_1(X)}{q_1^2(X) + q_2^2(X)} \\
     &= \frac{q_1(X)\, q_2(X)}{1 - 2\, q_1(X)\, q_2(X)}
\end{aligned}
\tag{7.85}
$$

where $q_i(X_{NN}) \cong q_i(X)$ is used for the asymptotic analysis. The last line of (7.85) is the expression of $r(X)$ in terms of $q_1(X) q_2(X)$.
It is easy to show that (7.85) is bounded by the NN risk from the upper side and the Bayes risk from the lower side in the range of $0 \le q_1 q_2 \le 0.25$, as

$$
\min[q_1(X),\, q_2(X)] \;\le\; \frac{q_1(X)\, q_2(X)}{1 - 2\, q_1(X)\, q_2(X)} \;\le\; 2\, q_1(X)\, q_2(X) \; .
$$

The proof is left for the reader.
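Although the proof is left to the reader, the ordering is easy to confirm numerically. The following is a minimal sketch (not from the text), assuming the two-class forms $\min(q_1, q_2)$ and $2 q_1 q_2$ for the conditional Bayes and NN risks:

```python
import numpy as np

# Sweep the posterior q1 over (0, 1); q2 = 1 - q1 in the two-class case.
q1 = np.linspace(0.001, 0.999, 999)
q2 = 1.0 - q1
t = q1 * q2                        # q1*q2 lies in (0, 0.25]

bayes  = np.minimum(q1, q2)        # conditional Bayes risk
edited = t / (1.0 - 2.0 * t)       # edited NN risk, last line of (7.85)
nn     = 2.0 * t                   # asymptotic conditional NN risk

# The edited NN risk sits between the Bayes and NN risks everywhere.
assert np.all(bayes <= edited + 1e-12) and np.all(edited <= nn + 1e-12)
print("min(q1,q2) <= q1*q2/(1-2*q1*q2) <= 2*q1*q2 holds on the sweep")
```

Equality holds at $q_1 = q_2 = 0.5$, where $q_1 q_2 = 0.25$, which is why the stated range ends there.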
Also, (7.85) can be generalized to the case where the $k_1$NN is used for removing samples and the $k_2$NN for testing. When both $k_1$ and $k_2$ are odd, the resulting error becomes larger than the Bayes error. When $k_2$ is even, the resulting error becomes smaller than the Bayes error.
Also, the edited NN procedure can be applied repeatedly to remove the $\omega_2$-samples in the $\omega_1$-region. The asymptotic analysis also can be carried out by applying (7.85) repeatedly [22]. Figure 7-16 shows the results of the repeated edited NN procedure.
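The editing and testing steps themselves are simple to implement. Below is a minimal sketch (not the book's code) of the generalized procedure: a $k_1$NN vote removes inconsistent samples, the pass may be repeated, and the survivors are used for $k_2$NN classification. All names and defaults here are illustrative assumptions.

```python
import numpy as np

def knn_vote(X, y, queries, k, leave_one_out=False):
    """Majority label among the k nearest samples in (X, y) for each query."""
    # Squared Euclidean distances: one row per query, one column per sample.
    d2 = ((queries[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    if leave_one_out:
        np.fill_diagonal(d2, np.inf)   # only valid when queries is X itself
    nearest = np.argsort(d2, axis=1)[:, :k]
    # Two-class labels 0/1 assumed; ties (even k) fall to class 0 here.
    return (y[nearest].mean(axis=1) > 0.5).astype(int)

def repeated_edited_nn(X, y, test, k1=3, k2=1, passes=2):
    """Edit with the k1-NN rule, repeat, then classify test points by k2-NN."""
    for _ in range(passes):
        pred = knn_vote(X, y, X, k1, leave_one_out=True)
        keep = pred == y               # remove samples their neighbors reject
        X, y = X[keep], y[keep]
    return knn_vote(X, y, test, k2)
```

Each editing pass tends to strip the overlapping samples near the decision boundary, which is what drives the risk from the NN level toward the Bayes level.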



                    Reduction of Computation Time

One of the severe drawbacks of any nonparametric approach is that it requires a large amount of computation time. For the kNN approach, this is due to the fact that, in order to find the kNNs of a test sample, we must compute its distances to all of the design samples. The same is true for the Parzen approach. Because of this computational burden, nonparametric approaches are not popular as on-line classifiers. In the off-line operation of Bayes error estimation, each sample must be tested by computing the distances to all of the other samples. This means that all possible pairwise distances among the samples must be computed. This becomes a burden to a computer, slows down the turn-around time, and limits the number of samples we can use even when more samples are available.
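To see where the quadratic cost arises, here is a brute-force sketch (an illustration, not the book's procedure) of a leave-one-out NN error estimate; all $N(N-1)/2$ pairwise distances are formed explicitly:

```python
import numpy as np

def loo_nn_error(X, y):
    """Leave-one-out 1-NN error estimate by brute force.

    Cost grows as O(N^2 * d): doubling the number of samples
    quadruples the distance computations (and the memory for d2).
    """
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # all pairwise distances
    np.fill_diagonal(d2, np.inf)     # a sample may not be its own neighbor
    nn = np.argmin(d2, axis=1)       # nearest neighbor of each sample
    return np.mean(y[nn] != y)       # fraction misclassified by its NN
```

This quadratic growth in both time and memory is exactly the burden described above.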