Page 347 - Introduction to Statistical Pattern Recognition

7  Nonparametric Classification and Error Estimation          329


The threshold for normal distributions: However, if the $p_i(X)$'s are normal distributions, the effect of the threshold can be analyzed as follows. Recall from (6.7) that the expected value of $\hat{p}_i(X)$ in the Parzen density estimate is given by $p_i(X) * \kappa_i(X)$. When both $p_i(X)$ and $\kappa_i(X)$ are normal with covariance matrices $\Sigma_i$ and $r^2\Sigma_i$ respectively, this convolution yields another normal density with mean $M_i$ and covariance $(1+r^2)\Sigma_i$, as shown in (6.31). For larger values of $r$, the variance of $\hat{p}_i(X)$ decreases, and the estimate approaches its expected value. Substituting the expected values into the estimated likelihood ratio, one obtains







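$$
-\ln\frac{\hat{p}_1(X)}{\hat{p}_2(X)}
= \frac{1}{2}(X-M_1)^T\left[\frac{\Sigma_1^{-1}}{1+r^2}\right](X-M_1)
- \frac{1}{2}(X-M_2)^T\left[\frac{\Sigma_2^{-1}}{1+r^2}\right](X-M_2)
+ \frac{1}{2}\ln\frac{|\Sigma_1|}{|\Sigma_2|} .
\tag{7.53}
$$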

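Except for the $1/(1+r^2)$ factors on the inverse covariance matrices, this expression is identical to the true likelihood ratio, $-\ln p_1(X)/p_2(X)$. In fact, the two may be related by

$$
-\ln\frac{\hat{p}_1(X)}{\hat{p}_2(X)}
= \frac{1}{1+r^2}\left[-\ln\frac{p_1(X)}{p_2(X)}\right]
+ \frac{1}{2}\left(\frac{r^2}{1+r^2}\right)\ln\frac{|\Sigma_1|}{|\Sigma_2|} .
\tag{7.54}
$$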
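The relation between the estimated and true likelihood ratios can be checked numerically. The sketch below assumes the univariate case for simplicity, so each covariance matrix reduces to a scalar variance. It compares the log-likelihood ratio of the two variance-inflated normals with the true ratio scaled by $1/(1+r^2)$ and shifted by $\frac{1}{2}\frac{r^2}{1+r^2}\ln(\sigma_1^2/\sigma_2^2)$. The function names and the parameter values are illustrative choices, not taken from the text.

```python
import math

def neg_log_ratio(x, m1, v1, m2, v2):
    """-ln[N(x; m1, v1) / N(x; m2, v2)] for univariate normal densities."""
    return (0.5 * (x - m1) ** 2 / v1
            - 0.5 * (x - m2) ** 2 / v2
            + 0.5 * math.log(v1 / v2))

def predicted_from_true(h_true, r, v1, v2):
    """Scale the true ratio by 1/(1+r^2) and add the variance-dependent offset."""
    scale = 1.0 / (1.0 + r * r)
    offset = 0.5 * (r * r / (1.0 + r * r)) * math.log(v1 / v2)
    return scale * h_true + offset

# Hypothetical class parameters and kernel size, chosen only for illustration.
m1, v1, m2, v2, r = 0.0, 1.0, 2.0, 4.0, 1.5

for x in (-2.0, 0.0, 1.0, 3.5):
    h_true = neg_log_ratio(x, m1, v1, m2, v2)
    # Expected Parzen estimates: same means, variances inflated by (1 + r^2).
    h_hat = neg_log_ratio(x, m1, (1 + r * r) * v1, m2, (1 + r * r) * v2)
    predicted = predicted_from_true(h_true, r, v1, v2)
    print(f"x={x:5.2f}  h_hat={h_hat:.6f}  predicted={predicted:.6f}")
```

The two printed columns should agree to rounding error for every `x`, since the relation is an algebraic identity once the expected values are substituted.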
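The true Bayes decision rule is given by $-\ln p_1(X)/p_2(X) \underset{\omega_2}{\overset{\omega_1}{\lessgtr}} \ln P_1/P_2$. Using (7.54), an equivalent test may be expressed in terms of the estimated densities:

$$
-\ln\frac{\hat{p}_1(X)}{\hat{p}_2(X)} \underset{\omega_2}{\overset{\omega_1}{\lessgtr}} t ,
\tag{7.55}
$$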
                    where
$$
t = \frac{1}{1+r^2}\left(\ln\frac{P_1}{P_2}\right)
+ \frac{1}{2}\left(\frac{r^2}{1+r^2}\right)\ln\frac{|\Sigma_1|}{|\Sigma_2|} .
\tag{7.56}
$$
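As a rough illustration of how the threshold in (7.56) might be computed, the following sketch evaluates $t$ from the kernel size $r$, the two covariance matrices, and the priors. The helper names `log_det_spd` and `parzen_threshold` are hypothetical. The log-determinant is obtained from a plain Cholesky factorization, which assumes both covariance matrices are symmetric positive definite.

```python
import math

def log_det_spd(matrix):
    """ln|A| for a symmetric positive-definite matrix via Cholesky factorization."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    # |A| = prod(L_ii)^2, so ln|A| = 2 * sum(ln L_ii).
    return 2.0 * sum(math.log(lower[i][i]) for i in range(n))

def parzen_threshold(r, cov1, cov2, prior1=0.5, prior2=0.5):
    """Threshold t of (7.56) for normal data and a normal kernel of size r."""
    shrink = 1.0 / (1.0 + r * r)
    prior_term = shrink * math.log(prior1 / prior2)
    cov_term = 0.5 * (r * r * shrink) * (log_det_spd(cov1) - log_det_spd(cov2))
    return prior_term + cov_term

# Hypothetical 2-D covariance matrices, chosen only for illustration.
identity = [[1.0, 0.0], [0.0, 1.0]]
scaled = [[4.0, 0.0], [0.0, 4.0]]

for r in (0.5, 1.0, 2.0):
    t_equal = parzen_threshold(r, identity, identity)
    t_unequal = parzen_threshold(r, identity, scaled)
    print(f"r={r:3.1f}  t(equal cov)={t_equal:.4f}  t(unequal cov)={t_unequal:.4f}")
```

With equal priors and equal covariance matrices both terms are zero, so the threshold stays at zero for every `r`. With unequal covariance matrices the second term changes with `r`.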

In all of our experiments, we assume $P_1 = P_2 = 0.5$, so the first term of (7.56) may be neglected. Equation (7.56) gives the appropriate threshold to use when the Parzen classifier with a normal kernel function is used on normal data. This indicates that $t$ can be kept at zero if $\Sigma_1 = \Sigma_2$, but $t$ should be adjusted for each value of $r$ if $\Sigma_1 \neq \Sigma_2$. Otherwise, the classifier based on the Parzen