    \frac{1}{2N_i(N_i-1)} + \frac{1}{2}\ln\frac{N_i}{N_i-1} + \frac{n-1}{2}\ln\frac{N_i-1}{N_i-2} > 0        (5.134)
for N_i > 2. The inequality of (5.134) holds since the numerators of the second and third terms are larger than the corresponding denominators.
Comments: Since g of (5.130) is always positive, ĥ_L > ĥ_R for ω_1-samples and ĥ_L < ĥ_R for ω_2-samples. Since ĥ > 0 is the condition for ω_1-samples to be misclassified and ĥ < 0 is the condition for ω_2-samples to be misclassified, the L method always gives a larger error than the R method. This is true for any test distributions, and is not necessarily limited to normal distributions. Note that this conclusion is a stronger statement than the inequality of (5.116), because the inequality of (5.116) holds only for the expectation of errors, while the above statement is for individual samples of individual tests.
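This per-sample ordering is easy to check numerically. The sketch below is an illustration rather than the book's program: it uses artificial normal samples with an arbitrary mean shift, a quadratic discriminant of the usual form (one-half the difference of the two Mahalanobis distances plus one-half the log ratio of the determinants), the unbiased sample covariance, and arbitrary values of n and N. For every ω_1 design sample the leave-one-out value ĥ_L (its own sample removed from the class-1 statistics) should exceed the resubstitution value ĥ_R, and the reverse should hold for every ω_2 design sample.

import numpy as np

rng = np.random.default_rng(0)
n, N = 8, 100                                   # dimension and samples per class (illustrative)
X1 = rng.standard_normal((N, n))                # omega_1 design samples
X2 = rng.standard_normal((N, n)) + 1.0          # omega_2 design samples, shifted mean

def stats(X):
    # sample mean and unbiased sample covariance of one class
    return X.mean(axis=0), np.cov(X, rowvar=False)

def h(x, M1, S1, M2, S2):
    # quadratic discriminant: h < 0 -> omega_1, h > 0 -> omega_2
    q1 = (x - M1) @ np.linalg.solve(S1, x - M1)
    q2 = (x - M2) @ np.linalg.solve(S2, x - M2)
    return 0.5 * (q1 - q2) + 0.5 * (np.linalg.slogdet(S1)[1] - np.linalg.slogdet(S2)[1])

M1, S1 = stats(X1)
M2, S2 = stats(X2)

# R method: each design sample is tested on the classifier built from all samples.
# L method: the tested sample is first removed from its own class statistics.
g1 = [h(x, *stats(np.delete(X1, i, axis=0)), M2, S2) - h(x, M1, S1, M2, S2)
      for i, x in enumerate(X1)]
g2 = [h(x, M1, S1, *stats(np.delete(X2, i, axis=0))) - h(x, M1, S1, M2, S2)
      for i, x in enumerate(X2)]

print("min h_L - h_R over omega_1 design samples:", min(g1))   # expected positive for every sample
print("max h_L - h_R over omega_2 design samples:", max(g2))   # expected negative for every sample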
Since we have the exact perturbation equation of (5.130), the use of this equation is recommended to conduct the R and L methods. However, for further theoretical analysis, (5.130) is a little too complex. An approximation of (5.130) may be obtained by assuming N_i >> d̂_i^2 and N_i >> 1. When X is distributed normally, it is known that d^2(X) has the chi-square distribution with an expected value of n and a standard deviation of √(2n), where d^2(X) = (X − M)^T Σ^{−1} (X − M) [see (3.59)-(3.61)]. Therefore, if N >> n, the approximation based on N >> d^2 is justified. Also, ln(1 + δ) ≅ δ for a small δ is used to approximate the second and third terms of (5.130). The resulting approximation is
    g(N_i, \hat{d}_i^2(X_\ell^{(i)})) \cong \frac{1}{2N_i}\left[\hat{d}_i^4(X_\ell^{(i)}) + n\right] .        (5.135)
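One way to see where this comes from is the following sketch. It is not a reproduction of (5.130): it assumes the standard rank-one (Sherman–Morrison) leave-one-out identities for the sample mean and the maximum-likelihood covariance estimate of class i; with the unbiased estimate the intermediate expressions change slightly, but the first-order result is the same. Writing d̂_i^2 for the distance of X_\ell^{(i)} under the full-sample estimates and the subscript L for the leave-one-out quantities,

    \hat{d}_{i,L}^{2}(X_\ell^{(i)}) = \frac{N_i \hat{d}_i^2}{N_i - 1 - \hat{d}_i^2},
    \qquad
    \ln\frac{|\hat{\Sigma}_{i,L}|}{|\hat{\Sigma}_i|} = n \ln\frac{N_i}{N_i - 1} + \ln\Bigl(1 - \frac{\hat{d}_i^2}{N_i - 1}\Bigr),

    g = \tfrac{1}{2}\bigl(\hat{d}_{i,L}^{2} - \hat{d}_i^{2}\bigr) + \tfrac{1}{2}\ln\frac{|\hat{\Sigma}_{i,L}|}{|\hat{\Sigma}_i|}
    \;\cong\; \frac{\hat{d}_i^{2} + \hat{d}_i^{4}}{2N_i} + \frac{n}{2N_i} - \frac{\hat{d}_i^{2}}{2N_i}
    \;=\; \frac{1}{2N_i}\bigl[\hat{d}_i^{4} + n\bigr],

where the last approximation uses N_i >> d̂_i^2, N_i >> 1, and ln(1 + δ) ≅ δ.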
In order to confirm that the L and R methods give the upper and lower bounds of the Bayes error, the following experiment was conducted.

     Experiment 7: Error of the quadratic classifier, L and R
           Data: I-Λ (Normal, n = 8, ε = 1.9%)
           Classifier: Quadratic classifier of (5.54)
           Sample size: N_1 = N_2 = 12, 50, 100, 200, 400
           No. of trials: τ = 40
           Results: Table 5-9 [14]
As expected, the L and R methods bound the Bayes error.
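A rough Monte Carlo sketch in the spirit of Experiment 7 is given below. It is not the book's program: two normal classes with identity covariances and an arbitrary mean shift stand in for Data I-Λ, which makes the Bayes error of the synthetic data available in closed form, and only one sample size is run. Averaged over τ trials, the R and L error estimates should bracket the Bayes error, mirroring the behavior reported in Table 5-9.

import math
import numpy as np

rng = np.random.default_rng(1)
n, N, tau = 8, 100, 40
shift = np.full(n, 0.8)                       # arbitrary class-mean separation
# exact Bayes error for equal identity covariances and equal priors
bayes = 0.5 * math.erfc(np.linalg.norm(shift) / (2.0 * math.sqrt(2.0)))

def stats(X):
    return X.mean(axis=0), np.cov(X, rowvar=False)

def h(x, M1, S1, M2, S2):
    q1 = (x - M1) @ np.linalg.solve(S1, x - M1)
    q2 = (x - M2) @ np.linalg.solve(S2, x - M2)
    return 0.5 * (q1 - q2) + 0.5 * (np.linalg.slogdet(S1)[1] - np.linalg.slogdet(S2)[1])

eL, eR = [], []
for _ in range(tau):
    X1 = rng.standard_normal((N, n))
    X2 = rng.standard_normal((N, n)) + shift
    M1, S1 = stats(X1)
    M2, S2 = stats(X2)
    # R method: reclassify the design samples with the full-sample classifier.
    r1 = np.mean([h(x, M1, S1, M2, S2) > 0 for x in X1])
    r2 = np.mean([h(x, M1, S1, M2, S2) < 0 for x in X2])
    eR.append(0.5 * (r1 + r2))
    # L method: leave each design sample out of its own class statistics.
    l1 = np.mean([h(x, *stats(np.delete(X1, i, axis=0)), M2, S2) > 0 for i, x in enumerate(X1)])
    l2 = np.mean([h(x, M1, S1, *stats(np.delete(X2, i, axis=0))) < 0 for i, x in enumerate(X2)])
    eL.append(0.5 * (l1 + l2))

print(f"mean R error = {np.mean(eR):.3f}, Bayes error = {bayes:.3f}, mean L error = {np.mean(eL):.3f}")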