Page 157 - Introduction to Statistical Pattern Recognition

4  Parametric Classifiers                                     139



                    Procedure II to find s (the resubstitution method):

                           (1)  Compute the sample mean, $\hat{M}_i$, and sample covariance matrix, $\hat{\Sigma}_i$.
                           (2)  Calculate V for a given s by $V = [s\hat{\Sigma}_1 + (1-s)\hat{\Sigma}_2]^{-1}(\hat{M}_2 - \hat{M}_1)$.

                           (3)  Using the V obtained, compute $y_j^{(i)} = V^T X_j^{(i)}$ ($i = 1,2$;
                               $j = 1, \ldots, N_i$), where $X_j^{(i)}$ is the $j$th $\omega_i$-sample.
                           (4)  The $y_j^{(1)}$'s and $y_j^{(2)}$'s which do not satisfy $y_j^{(1)} < -v_0$ and
                               $y_j^{(2)} > -v_0$ are counted as errors.  Changing $v_0$ from $-\infty$ to $+\infty$,
                               find the $v_0$ which gives the smallest error.
                           (5)  Change s from 0 to 1, and plot the error vs. s.
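The five steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the book's implementation: the function names `procedure_ii_error` and `sweep_s` are hypothetical, the two Gaussian sample sets are placeholder data, the threshold is swept over the midpoints between sorted projections, and the error is the plain fraction of misclassified samples (which matches equal priors only when both classes have the same number of samples).

```python
import numpy as np

def procedure_ii_error(X1, X2, s):
    """Resubstitution error for one value of s.

    X1, X2: arrays of shape (N1, n) and (N2, n) holding the omega_1 and
    omega_2 samples. Returns (smallest error rate, threshold v0).
    """
    # Step (1): sample means and sample covariance matrices.
    M1, M2 = X1.mean(axis=0), X2.mean(axis=0)
    S1 = np.atleast_2d(np.cov(X1, rowvar=False))
    S2 = np.atleast_2d(np.cov(X2, rowvar=False))

    # Step (2): V = [s*S1 + (1-s)*S2]^{-1} (M2 - M1).
    V = np.linalg.solve(s * S1 + (1.0 - s) * S2, M2 - M1)

    # Step (3): project every sample onto V.
    y1, y2 = X1 @ V, X2 @ V

    # Step (4): with t = -v0, an omega_1 sample is an error when y >= t
    # and an omega_2 sample is an error when y <= t. Sweep t from -inf
    # to +inf through the midpoints between sorted projections.
    ys = np.sort(np.concatenate([y1, y2]))
    cuts = np.concatenate([[-np.inf], (ys[:-1] + ys[1:]) / 2.0, [np.inf]])
    errors = np.array([(y1 >= t).sum() + (y2 <= t).sum() for t in cuts])
    best = errors.argmin()
    return errors[best] / (len(y1) + len(y2)), -cuts[best]

def sweep_s(X1, X2, s_values):
    """Step (5): smallest resubstitution error for each s."""
    return np.array([procedure_ii_error(X1, X2, s)[0] for s in s_values])

# Placeholder data (an assumption, not the data used in the book).
rng = np.random.default_rng(0)
n, N = 8, 50
X1 = rng.normal(0.0, 1.0, size=(N, n))
X2 = rng.normal(1.0, 2.0, size=(N, n))

s_values = np.linspace(0.0, 1.0, 11)
errors = sweep_s(X1, X2, s_values)
best_s = s_values[errors.argmin()]
print("s with smallest resubstitution error:", best_s)
```

Plotting `errors` against `s_values` gives the error vs. s curve asked for in step (5). Because the same samples design and test the classifier, the values are resubstitution estimates.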

                    Note  that  in  this  process  no  assumption is  made  on  the  distributions of  X.
                    Also, the criterion function, $f$, is never set up.  Instead of using an equation for
                    $f$, the empirical error-count is used.  The procedure is based solely on our
                    knowledge that the optimum V must have the form of
                    $[s\Sigma_1 + (1-s)\Sigma_2]^{-1}(M_2 - M_1)$.
                         In order to confirm the validity of the procedure, the following
                    experiment was conducted.

                         Experiment 1: Finding the optimum s (Procedure II)
                               Data:  $I$-$\Lambda$ (Normal, $n = 8$, $\epsilon = 1.9\%$)
                               Sample size:  $N_1 = N_2 = 50, 200$
                               No. of trials:  $\tau = 10$
                               Results:  Fig. 4-8
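The protocol of this experiment (two sample sizes, errors averaged over 10 trials for each s) can be mimicked with the sketch below. It is only an illustration under assumptions: the class distributions are placeholder Gaussians rather than the data set named above, whose parameters are not given in this section, and `min_resub_error` and `run_experiment` are hypothetical names. The numbers it prints therefore do not reproduce Fig. 4-8.

```python
import numpy as np

def min_resub_error(X1, X2, s):
    """Smallest resubstitution error over all thresholds for a given s."""
    M1, M2 = X1.mean(axis=0), X2.mean(axis=0)
    S = s * np.cov(X1, rowvar=False) + (1.0 - s) * np.cov(X2, rowvar=False)
    V = np.linalg.solve(S, M2 - M1)
    y1, y2 = X1 @ V, X2 @ V
    # Candidate thresholds t = -v0: -inf, midpoints of sorted projections, +inf.
    ys = np.sort(np.concatenate([y1, y2]))
    cuts = np.concatenate([[-np.inf], (ys[:-1] + ys[1:]) / 2.0, [np.inf]])
    counts = [(y1 >= t).sum() + (y2 <= t).sum() for t in cuts]
    return min(counts) / (len(y1) + len(y2))

def run_experiment(sample_sizes=(50, 200), trials=10, n=8, seed=0):
    """Average the resubstitution error vs. s over independent trials."""
    rng = np.random.default_rng(seed)
    s_values = np.linspace(0.0, 1.0, 11)
    curves = {}
    for N in sample_sizes:
        total = np.zeros(len(s_values))
        for _ in range(trials):
            # Placeholder class distributions (an assumption).
            X1 = rng.normal(0.0, 1.0, size=(N, n))
            X2 = rng.normal(0.8, 2.0, size=(N, n))
            total += np.array([min_resub_error(X1, X2, s) for s in s_values])
        curves[N] = total / trials
    return s_values, curves

s_values, curves = run_experiment()
for N, curve in curves.items():
    print(N, np.round(curve, 3))
```

Each entry of `curves` is one averaged error vs. s curve, one per sample size, which is the kind of quantity plotted in the figure.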


                    Samples were generated and the error was counted according to Procedure II.
                    The averaged error over 10 trials vs. s is plotted in Fig. 4-8.  Note that the
                    error of Procedure II is smaller than the error of Procedure I.  This is due to the
                    fact that the same samples are used for both designing and testing the classifier.
                    This method of using available samples is called the resubstitution method, and
                    produces an optimistic bias.  The bias is reduced by  increasing the sample size
                    as seen in Fig. 4-8.  In order to avoid this bias, we need to assure independence
                    between design and test samples, as follows: