Page 337 - Introduction to Statistical Pattern Recognition

7  Nonparametric Classification and Error Estimation         319


                     (6.54) shows that an expression of the form $|A|^{-1/n}\,\mathrm{tr}\{AB_1\}$ is minimized by
                     setting $A = B_1^{-1}$, provided $B_1$ is a positive definite matrix.  However, $B_1$ might
                     not be positive definite, because of the term $[q_2 - q_1]$ in (7.36).  Thus, it is not
                     immediately clear how to choose $A$ to minimize the bias.  Nevertheless, selection
                     of an appropriate metric remains an important topic in NN error estimation
                     [9-10].

                     Experimental Verification

                          In order to verify the results mentioned above, the following experiment
                     was run:

                          Experiment 3: Voting NN error estimation,
                                         L method (Table 7-1(a))
                               Data: I-I (Normal, n = 8)
                                     M adjusted to give $\varepsilon^* = 2, 5, 10, 20, 30$ (%)
                               Sample size: $N_1 = N_2 = 20n, 40n, 80n, 160n$
                               No. of trials: $\tau = 20$
                               Metric: $A = I$ (Euclidean)
                               Results: Fig. 7-6 [8]
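
                     The setup of Experiment 3 can be sketched in code.  The block below is a
                     minimal illustration, not the procedure used to produce Fig. 7-6.  It assumes
                     that Data I-I consists of two normal classes with identity covariance, equal
                     priors, and means 0 and (m, 0, ..., 0), so that the Bayes error is
                     $\Phi(-m/2)$; it also takes the voting NN rule with a single neighbor.  The
                     function names (`make_two_normals`, `loo_nn_error`, `run_experiment`) are
                     hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def make_two_normals(n_dim, n_per_class, bayes_error, rng):
    """Draw two normal classes with identity covariance (assumed form of Data I-I).

    Class 1 has mean 0; class 2 has mean (m, 0, ..., 0), where m is chosen so
    that the Bayes error Phi(-m/2) equals bayes_error under equal priors.
    """
    m = -2.0 * NormalDist().inv_cdf(bayes_error)
    class1 = [[rng.gauss(0.0, 1.0) for _ in range(n_dim)]
              for _ in range(n_per_class)]
    class2 = [[rng.gauss(m if d == 0 else 0.0, 1.0) for d in range(n_dim)]
              for _ in range(n_per_class)]
    return class1 + class2, [0] * n_per_class + [1] * n_per_class


def loo_nn_error(samples, labels):
    """Leave-one-out (L method) 1-NN error rate with the Euclidean metric (A = I)."""
    errors = 0
    for i, x in enumerate(samples):
        best_dist, best_label = float("inf"), None
        for j, y in enumerate(samples):
            if j == i:
                continue  # the tested sample is left out of the design set
            dist = sum((a - b) ** 2 for a, b in zip(x, y))
            if dist < best_dist:
                best_dist, best_label = dist, labels[j]
        errors += best_label != labels[i]
    return errors / len(samples)


def run_experiment(n_dim=8, n_per_class=160, bayes_error=0.10, trials=20, seed=0):
    """Return (average, standard deviation) of the NN error over independent trials."""
    rng = random.Random(seed)
    errs = [loo_nn_error(*make_two_normals(n_dim, n_per_class, bayes_error, rng))
            for _ in range(trials)]
    return mean(errs), stdev(errs)
```

                     Calling `run_experiment(n_per_class=20 * 8)` corresponds to the smallest
                     sample size in the list above; the pure-Python distance loop is slow for the
                     larger sizes, so a vectorized implementation would be preferable in practice.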
                     In Fig. 7-6, the small circle indicates the average of the NN errors over 20
                     trials, and the vertical bar represents ± one standard deviation.  According to
                     (7.33), the bias of the NN error varies linearly with $\beta_1$ for any given set of
                     distributions.  Therefore, if we know $\varepsilon^*_{NN}$ and $E_X\{\cdot\}$, we can predict the finite
                     sample NN errors as linear functions of $\beta_1$.  The dotted lines of Fig. 7-6 show
                     these predicted NN errors for various values of the Bayes error.  The $E_X\{\cdot\}$'s of
                     (7.35) are tabulated in Table 7-2.  The theoretical asymptotic error, $\varepsilon^*_{NN}$, was
                     estimated by generating a large number (1600n) of samples, calculating the
                     risk at each sample point from (7.11) using the known mathematical expressions
                     for $q_i(X)$ in the normal case, and averaging the result.  Note that the averages
                     of these measured $\hat{\varepsilon}_{NN}$'s are reasonably close to the predicted values.
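
                     The linear dependence on $\beta_1$ can be illustrated numerically.  The sketch
                     below assumes the predicted error has the form
                     $\varepsilon^*_{NN} + \beta_1 E_X\{\cdot\}$, as the paragraph above describes,
                     and shows that a least-squares line through (β_1, error) points has the
                     asymptotic error as its intercept.  The function names and all numerical
                     values are hypothetical placeholders, not entries from Table 7-2 or Fig. 7-6.

```python
def predict_nn_error(eps_asym, beta1, ex_term):
    """Predicted finite-sample NN error, assumed linear in beta1."""
    return eps_asym + beta1 * ex_term


def fit_line(betas, errors):
    """Ordinary least-squares fit errors ~ intercept + slope * beta1."""
    n = len(betas)
    mean_b = sum(betas) / n
    mean_e = sum(errors) / n
    sxx = sum((b - mean_b) ** 2 for b in betas)
    sxy = sum((b - mean_b) * (e - mean_e) for b, e in zip(betas, errors))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_b
    return intercept, slope


# Hypothetical values: asymptotic error 0.15, E_X term 0.5, three beta1 values.
betas = [0.1, 0.2, 0.4]
errors = [predict_nn_error(0.15, b, 0.5) for b in betas]
intercept, slope = fit_line(betas, errors)  # intercept estimates the asymptotic error
```

                     With noise-free inputs the fit recovers the assumed intercept and slope up to
                     rounding; with measured errors the intercept is only an estimate, and its
                     accuracy depends on the scatter of the points.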
                          While it may not be  practical to obtain the asymptotic NN errors simply
                     by  increasing the sample size, it may be possible to use information concerning
                     how  the bias changes with  sample size to  our  advantage.  We  could measure
                     $\hat{\varepsilon}_{NN}$ empirically for several sample sizes, and obtain $\beta_1$ using either (7.37) or
                     Fig. 7-5.  These values could be  used  in  conjunction with  (7.35) to  obtain an
                     estimate of the asymptotic NN error as follows: