Page 279 - Introduction to Statistical Pattern Recognition

6 Nonparametric Density Estimation

[equation garbled in source] (6.29)

where $p \cong k/(Nv)$ and $N \gg k$ are used. This suggests that the second term of
(6.19) is much smaller than the first term, and can be ignored. Also, (6.29)
indicates that $k \to \infty$ is required along with $N \to \infty$ for the Parzen density
estimate to be consistent. These are the known conditions for asymptotic
unbiasedness and consistency [2].
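
The substitution described above can be sketched as follows. This is a hedged reconstruction, not the book's own equation: it assumes that (6.19) has the first-order form $\frac{1}{N}\left[w\,p(X)/v - p^2(X)\right]$, with $v$ taken to be the kernel volume and $w$ a kernel-dependent constant (both symbols are assumptions here, since (6.19) is not reproduced on this page).

$$
\mathrm{Var}\{\hat{p}(X)\} \cong \frac{1}{N}\left[\frac{w\,p}{v} - p^2\right]
= p^2\left[\frac{w}{k} - \frac{1}{N}\right]
\cong \frac{w\,p^2}{k} \qquad (N \gg k),
$$

where $1/(Nv) \cong p/k$ was used in the first term. Under these assumptions the ratio of the second term to the first is $k/(wN)$, which is small when $N \gg k$, and the variance relative to $p^2$ vanishes only if $k \to \infty$.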

Convolution of normal distributions: If $p(X)$ is assumed to be normal
and a normal kernel is selected for $\kappa(X)$, (6.7) and (6.9) become trivial to
evaluate. When two normal densities $N_X(0,A)$ and $N_X(0,B)$ are convolved, the
result is also a normal density $N_X(0,K)$, where

$$K = A + B . \tag{6.30}$$

In particular, if $A = \Sigma$ and $B = r^2\Sigma$,
$$K = (1 + r^2)\Sigma . \tag{6.31}$$
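
The convolution property can be checked numerically in the scalar case. The sketch below is only an illustration under stated assumptions: it works in one dimension (so $\Sigma$ becomes a variance `sigma2`), approximates the convolution integral by a Riemann sum, and uses hypothetical helper names (`normal_pdf`, `convolve_at`) that do not come from the text.

```python
import math

def normal_pdf(x, var):
    """Zero-mean normal density with variance `var`, evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def convolve_at(x, var_a, var_b, half_width=10.0, step=0.02):
    """Riemann-sum approximation of the convolution of N(0, var_a) and
    N(0, var_b), evaluated at the single point x."""
    n = int(round(half_width / step))
    total = 0.0
    for i in range(-n, n + 1):
        y = i * step
        total += normal_pdf(y, var_a) * normal_pdf(x - y, var_b)
    return total * step

# Scalar analogue of A = Sigma, B = r^2 Sigma (values chosen for illustration).
sigma2, r = 1.0, 0.5
points = [-2.0, -0.5, 0.0, 1.0, 3.0]

# Compare the numerical convolution with a normal density of variance (1 + r^2) sigma2.
max_err = max(
    abs(convolve_at(x, sigma2, r * r * sigma2) - normal_pdf(x, (1.0 + r * r) * sigma2))
    for x in points
)
print(max_err)
```

The printed discrepancy should be at the level of floating-point rounding, which is consistent with the variances simply adding.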

Optimal Kernel Size

Mean-square error criterion: In order to apply the density estimate of
(6.1) (or (6.2) with the kernel function of (6.3)), we need to select a value for $r$
[5-11]. The optimal value of $r$ may be determined by minimizing the
mean-square error between $\hat{p}(X)$ and $p(X)$ with respect to $r$.
$$\mathrm{MSE}\{\hat{p}(X)\} = E\{[\hat{p}(X) - p(X)]^2\} . \tag{6.32}$$
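
For a normal density and a normal kernel, the pointwise criterion (6.32) can be evaluated in closed form and minimized over $r$ by a simple grid search. The sketch below is a one-dimensional illustration under assumptions that are not taken from the text: the samples are i.i.d. $N(0, \sigma^2)$, the kernel is $N(0, r^2\sigma^2)$, the variance uses the standard i.i.d. result $\frac{1}{N}\left[p * \kappa^2 - (p * \kappa)^2\right]$, and the names `parzen_mse` and `best_r` are hypothetical.

```python
import math

def normal_pdf(x, var):
    """Zero-mean normal density with variance `var`, evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def parzen_mse(x, r, n_samples, sigma2=1.0):
    """Pointwise MSE (variance + squared bias) of a 1-D Parzen estimate with
    kernel N(0, r^2 sigma2) when the samples are i.i.d. N(0, sigma2)."""
    s2 = r * r * sigma2                        # kernel variance
    mean_est = normal_pdf(x, sigma2 + s2)      # E{p_hat} = (p * kappa)(x)
    # kappa^2 is a normal density of variance s2/2 scaled by 1/(2 sqrt(pi s2)),
    # so (p * kappa^2)(x) is also available in closed form.
    second_moment = normal_pdf(x, sigma2 + 0.5 * s2) / (2.0 * math.sqrt(math.pi * s2))
    variance = (second_moment - mean_est ** 2) / n_samples
    bias = mean_est - normal_pdf(x, sigma2)
    return variance + bias ** 2

def best_r(x, n_samples, grid=None):
    """Grid search for the r that minimizes the pointwise MSE at x."""
    grid = grid or [0.01 * i for i in range(1, 301)]
    return min(grid, key=lambda r: parzen_mse(x, r, n_samples))

# The minimizing r changes with the evaluation point and with the sample size.
r_at_0 = best_r(0.0, 1000)
r_at_1 = best_r(1.0, 1000)
r_at_0_large_n = best_r(0.0, 100000)
print(r_at_0, r_at_1, r_at_0_large_n)
```

In this sketch the minimizing $r$ differs between $x = 0$ and $x = 1$ and shrinks as the sample size grows, which illustrates why a pointwise criterion leads to an $X$-dependent kernel size.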

This criterion is a function of $X$, and thus the optimal $r$ also must be a function
of $X$. In order to make the optimal $r$ independent of $X$, we may use the
integral mean-square error
