Page 239 - Introduction to Statistical Pattern Recognition

5  Parameter Estimation                                       221



                   mate of the error. Since each test sample is excluded from the design sample
                   set, the independence between the design and test sets is maintained. Also, all
                   N samples are tested and N-1 samples are used for design. Thus, the available
                   samples are, in this method, more effectively utilized. Furthermore, we do not
                   need to worry about dissimilarity between the design and test distributions.
                   One of the disadvantages of the L method is that N classifiers must be
                   designed, one classifier for testing each sample. However, this problem is
                   easily overcome by a procedure which will be discussed later.
                        The H and L methods are supposed to give very similar, if not identical,
                   estimates of the classification error, and both provide upper bounds of the
                   Bayes error. In order to confirm this, an experiment was conducted as follows.
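The L (leave-one-out) procedure described above can be sketched directly: design N classifiers, each on N-1 samples, and test each classifier on the one sample it excluded. The sketch below assumes a simple nearest-mean design rule (matching the equal-covariance case of (5.117)); the helper names `loo_error` and `fit_nearest_mean` are illustrative, not from the text.

```python
import numpy as np

def loo_error(X, y, design):
    """L-method (leave-one-out) error estimate.

    X: (N, n) samples, y: (N,) labels in {0, 1}.
    design: function (X_train, y_train) -> classifier g, where g(x)
    returns a predicted label.  Each of the N samples is tested by a
    classifier designed on the remaining N-1 samples.
    """
    N = len(y)
    errors = 0
    for i in range(N):
        mask = np.arange(N) != i            # exclude the test sample
        g = design(X[mask], y[mask])        # design on N-1 samples
        errors += g(X[i]) != y[i]           # test the held-out sample
    return errors / N

# Hypothetical design procedure: nearest sample mean
def fit_nearest_mean(Xtr, ytr):
    m0 = Xtr[ytr == 0].mean(axis=0)
    m1 = Xtr[ytr == 1].mean(axis=0)
    return lambda x: int(np.sum((x - m1) ** 2) < np.sum((x - m0) ** 2))

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 4)), rng.normal(1.28, 1.0, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
print(loo_error(X, y, fit_nearest_mean))
```

The loop designs N classifiers, which is the disadvantage noted in the text; the shortcut that avoids redesigning from scratch is the procedure discussed later.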
                        Experiment 6: The H and L errors
                              Data: I-I (Normal, M^T M = 2.56², ε = 10%)
                              Dimensionality: n = 4, 8, 16, 32, 64
                              Classifier: Quadratic classifier of (5.54)
                              Sample size: N₁ = N₂ = kn (Design)
                                           N₁ = N₂ = kn (Test) for H
                                           N₁ = N₂ = kn for L
                                           k = 3, 5, 10, 20, 40
                              No. of trials: T = 10
                              Results: Table 5-8

                   The first and second lines of Table 5-8 show the average and standard devia-
                   tion of the H error estimate, while the third and fourth lines are the average and
                   standard deviation of the L error estimate. Both results are very close.
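The closeness of the two estimates can be checked in a small simulation. The sketch below mimics the setup of Experiment 6 with assumed parameters (two normal distributions whose mean difference has norm 2.56, giving a Bayes error near 10%), but substitutes a simple nearest-mean design rule for the quadratic classifier of (5.54) to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 4, 10                      # dimensionality and samples per dimension
N = k * n                         # design samples per class
m1 = np.zeros(n)
m2 = np.full(n, 2.56 / np.sqrt(n))  # ||M2 - M1|| = 2.56 (assumed Data I-I-like)

def fit_nm(X, y):
    """Nearest-mean design rule (stand-in for the quadratic classifier)."""
    a, b = X[y == 0].mean(0), X[y == 1].mean(0)
    return lambda Z: (np.sum((Z - b) ** 2, -1) < np.sum((Z - a) ** 2, -1)).astype(int)

def sample(N):
    X = np.vstack([rng.normal(m1, 1.0, (N, n)), rng.normal(m2, 1.0, (N, n))])
    return X, np.array([0] * N + [1] * N)

# H method: design and test on independent sample sets
Xd, yd = sample(N)
Xt, yt = sample(N)
h_err = np.mean(fit_nm(Xd, yd)(Xt) != yt)

# L method: each sample tested by a classifier designed on the rest
X, y = sample(N)
l_err = np.mean([fit_nm(np.delete(X, i, 0), np.delete(y, i))(X[i:i + 1])[0] != y[i]
                 for i in range(2 * N)])
print(round(h_err, 3), round(l_err, 3))
```

Averaging such runs over many trials, as in Experiment 6, the H and L averages should agree closely, both sitting above the Bayes error.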

                        Operation of the L method: In order to illustrate how the L method
                   works, let us examine the simplest case in which the two covariances are equal
                   and known to be I. Then, the Bayes classifier is

                   $$(X - M_1)^T (X - M_1) \;-\; (X - M_2)^T (X - M_2) \;\underset{\omega_2}{\overset{\omega_1}{\lessgtr}}\; t \; . \tag{5.117}$$
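A minimal sketch of the decision rule of (5.117), assuming identity covariances; the function name `bayes_decide` and the sample vectors are illustrative, and the threshold t defaults to 0 (equal priors).

```python
import numpy as np

def bayes_decide(x, m1, m2, t=0.0):
    """Decision rule of (5.117) for equal, identity covariances:
    decide omega_1 when the squared-distance difference is below t."""
    h = np.sum((x - m1) ** 2) - np.sum((x - m2) ** 2)
    return 1 if h < t else 2

m1, m2 = np.zeros(4), np.full(4, 1.0)
print(bayes_decide(np.array([0.1, 0.0, -0.2, 0.1]), m1, m2))  # → 1 (near M1)
print(bayes_decide(np.array([0.9, 1.0, 1.2, 0.9]), m1, m2))  # → 2 (near M2)
```

With identity covariances the quadratic terms cancel, so the rule reduces to comparing squared Euclidean distances to the two means.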
                   Assume that two sample sets, $S_1 = \{X_1^{(1)}, \ldots, X_{N_1}^{(1)}\}$ from $\omega_1$ and
                   $S_2 = \{X_1^{(2)}, \ldots, X_{N_2}^{(2)}\}$ from $\omega_2$, are given. In the R method, all of these samples are used
                   to design the classifier and also to test the classifier. With the given mathemat-