Page 279 - Introduction to Statistical Pattern Recognition

6  Nonparametric Density Estimation                           261


                    [equation not recoverable from the extracted text]          (6.29)

                    where $p \approx k/(Nv)$ and $N \gg k$ are used.  This suggests that the second term of
                    (6.19) is much smaller than the first term, and can be ignored.  Also, (6.29)
                    indicates that $k \to \infty$ is required along with $N \to \infty$ for the Parzen density
                    estimate to be consistent.  These are the known conditions for asymptotic
                    unbiasedness and consistency [2].

                         Convolution of normal distributions: If $p(X)$ is assumed to be normal
                    and a normal kernel is selected for $\kappa(X)$, (6.7) and (6.9) become trivial to
                    evaluate.  When two normal densities $N_X(0, A)$ and $N_X(0, B)$ are convolved, the
                    result is also a normal density $N_X(0, K)$, where



                                              $K = A + B$ .                       (6.30)


                    In particular, if $A = \Sigma$ and $B = r^2 \Sigma$,
                                              $K = (1 + r^2)\Sigma$ .             (6.31)
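
                    The convolution property can be checked numerically. The following is a minimal
                    sketch, assuming the one-dimensional case in which the matrices $A$ and $B$ reduce
                    to scalar variances; the function names, the grid settings, and the value r = 0.5
                    are illustrative choices and are not taken from the text. It convolves two
                    zero-mean normal densities on a grid and compares the result with a normal
                    density whose variance is $(1 + r^2)$ times the first variance.

```python
import numpy as np


def normal_pdf(x, var):
    """Zero-mean normal density with variance `var`, evaluated at x."""
    return np.exp(-0.5 * x**2 / var) / np.sqrt(2.0 * np.pi * var)


def convolve_normals_1d(var_a, var_b, half_width=12.0, n=4801):
    """Approximate the convolution of N(0, var_a) and N(0, var_b) on a grid.

    The grid is symmetric about zero with an odd number of points, so the
    'same' output of np.convolve lines up with the input grid.
    """
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    # Riemann-sum approximation of the convolution integral.
    conv = np.convolve(normal_pdf(x, var_a), normal_pdf(x, var_b), mode="same") * dx
    return x, conv


# Assumed illustrative values: A = 1 and B = r^2 * A with r = 0.5.
var_a = 1.0
r = 0.5
var_b = r**2 * var_a

x, conv = convolve_normals_1d(var_a, var_b)
expected = normal_pdf(x, (1.0 + r**2) * var_a)  # density with K = (1 + r^2) A
max_abs_err = float(np.max(np.abs(conv - expected)))
print(f"largest pointwise difference: {max_abs_err:.2e}")
```

                    Under these assumptions the two curves should agree up to discretization error.
                    The same additivity of covariances holds in the multivariate case, which the
                    sketch does not cover.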


                    Optimal Kernel Size

                         Mean-square error criterion: In order to apply the density estimate of
                    (6.1) (or (6.2) with the kernel function of (6.3)), we need to select a value for $r$
                    [5-11].  The optimal value of $r$ may be determined by minimizing the mean-square
                    error between $p(X)$ and $\hat{p}(X)$ with respect to $r$.
                                      $\mathrm{MSE}\{\hat{p}(X)\} = E\{[\hat{p}(X) - p(X)]^2\}$ .             (6.32)
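
                    As an illustration of the criterion in (6.32), the following is a minimal Monte
                    Carlo sketch. It assumes a one-dimensional standard normal $p(X)$ and a normal
                    kernel with standard deviation $r$; the kernel of (6.3) is not reproduced in this
                    section, so this kernel, the function names, the sample sizes, and the three
                    values of $r$ are all assumptions made for the example. The expectation in
                    (6.32) is approximated by averaging over independent sample sets.

```python
import numpy as np


def normal_pdf(x, var):
    """Zero-mean normal density with variance `var`, evaluated at x."""
    return np.exp(-0.5 * x**2 / var) / np.sqrt(2.0 * np.pi * var)


def parzen_mse_at_point(x0, r, n_samples=100, n_trials=2000, seed=0):
    """Monte Carlo estimate of E{[p_hat(x0) - p(x0)]^2} for one kernel size r.

    Assumptions: the true density is N(0, 1) and the kernel is a normal
    density with standard deviation r (hypothetical stand-in for (6.3)).
    """
    rng = np.random.default_rng(seed)
    # Each row is one independent sample set of size n_samples.
    samples = rng.standard_normal((n_trials, n_samples))
    # Parzen estimate at x0 for every sample set: average of kernel values.
    p_hat = normal_pdf(x0 - samples, r**2).mean(axis=1)
    p_true = normal_pdf(x0, 1.0)
    return float(np.mean((p_hat - p_true) ** 2))


# Three illustrative kernel sizes: very small, moderate, very large.
kernel_sizes = [0.02, 0.4, 3.0]
mse_values = [parzen_mse_at_point(0.0, r) for r in kernel_sizes]
for r, mse in zip(kernel_sizes, mse_values):
    print(f"r = {r:4.2f}   estimated MSE at x0 = 0: {mse:.5f}")
```

                    With these settings a very small $r$ should give a large error dominated by
                    variance, and a very large $r$ a large error dominated by bias, so the moderate
                    value is expected to give the smallest of the three estimates. Repeating the
                    run at a different `x0` generally shifts the best $r$.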

                    This criterion is a function of $X$, and thus the optimal $r$ also must be a function
                    of  X.  In  order  to  make  the  optimal  r  independent of  X,  we  may  use  the
                    integral mean-square error