Page 279 - Introduction to Statistical Pattern Recognition
6 Nonparametric Density Estimation
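$$\mathrm{Var}\{\hat{p}(X)\} \cong \frac{1}{N}\left[\frac{p(X)}{v} - p^2(X)\right] \cong \frac{p^2(X)}{k} \tag{6.29}$$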
where $p \cong k/(Nv)$ and $N \gg k$ are used. This suggests that the second term of
(6.19) is much smaller than the first term and can be ignored. Also, (6.29)
indicates that $k \to \infty$ is required along with $N \to \infty$ for the Parzen density
estimate to be consistent. These are the known conditions for asymptotic
unbiasedness and consistency [2].
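The relation between the variance of the estimate and the number of samples $k$ falling in the local window can be checked numerically. The following is a minimal sketch, assuming a one-dimensional standard normal density, a uniform kernel of width `width` (so that $v$ equals `width`), and evaluation at $X = 0$. The function name, the sample sizes, and the window width are illustrative choices, not values from the text.

```python
import numpy as np

def uniform_kernel_estimates(x, n_samples, width, n_trials, rng):
    """Return n_trials independent uniform-kernel estimates of an N(0, 1)
    density at x, together with the window counts k (illustrative helper)."""
    samples = rng.standard_normal((n_trials, n_samples))
    # k = number of samples inside the window [x - width/2, x + width/2]
    k = np.sum(np.abs(samples - x) <= width / 2.0, axis=1)
    # In one dimension the window volume v is simply its width.
    return k / (n_samples * width), k

rng = np.random.default_rng(0)
estimates, counts = uniform_kernel_estimates(
    x=0.0, n_samples=2000, width=0.2, n_trials=2000, rng=rng
)

empirical_var = estimates.var(ddof=1)                    # variance over trials
predicted_var = estimates.mean() ** 2 / counts.mean()    # p^2 / k
ratio = empirical_var / predicted_var
print(f"empirical variance: {empirical_var:.3e}")
print(f"p^2 / k prediction: {predicted_var:.3e}")
print(f"ratio             : {ratio:.3f}")
```

Under these assumptions $k$ is binomially distributed, so the ratio should come out near $1 - k/N$, slightly below one. This is the small correction that is dropped when $N \gg k$.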
Convolution of normal distributions: If $p(X)$ is assumed to be normal
and a normal kernel is selected for $\kappa(X)$, (6.7) and (6.9) become trivial to
evaluate. When two normal densities $N_X(0, A)$ and $N_X(0, B)$ are convolved, the
result is also a normal density $N_X(0, K)$, where
$$K = A + B . \tag{6.30}$$
In particular, if $A = \Sigma$ and $B = r^2 \Sigma$,
$$K = (1 + r^2)\Sigma . \tag{6.31}$$
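The convolution property can be verified numerically in the scalar case. The sketch below assumes a one-dimensional setting in which the covariance matrices reduce to variances, $A = \sigma^2$ and $B = r^2\sigma^2$. The values of `sigma2` and `r`, the grid, and the helper `normal_pdf` are illustrative assumptions.

```python
import numpy as np

def normal_pdf(x, variance):
    """Zero-mean normal density with the given variance (illustrative helper)."""
    return np.exp(-0.5 * x**2 / variance) / np.sqrt(2.0 * np.pi * variance)

sigma2 = 1.5   # variance of the assumed density p(X)
r = 0.8        # kernel size parameter

# Symmetric grid with an odd number of points, so that mode="same"
# keeps the convolution output aligned with x.
x = np.linspace(-12.0, 12.0, 2401)
dx = x[1] - x[0]

density = normal_pdf(x, sigma2)           # N(0, A) with A = sigma2
kernel = normal_pdf(x, r**2 * sigma2)     # N(0, B) with B = r^2 * sigma2

# Riemann-sum approximation of the convolution integral.
convolved = np.convolve(density, kernel, mode="same") * dx

# Density with variance K = (1 + r^2) * sigma2, for comparison.
expected = normal_pdf(x, (1.0 + r**2) * sigma2)

max_abs_error = np.max(np.abs(convolved - expected))
print(f"max |convolved - expected| = {max_abs_error:.2e}")
```

The grid extends well beyond the spread of both densities, so truncating the integral should have negligible effect on the comparison.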
Optimal Kernel Size
Mean-square error criterion: In order to apply the density estimate of
(6.1) (or (6.2) with the kernel function of (6.3)), we need to select a value for $r$
[5-11]. The optimal value of $r$ may be determined by minimizing the
mean-square error between $\hat{p}(X)$ and $p(X)$ with respect to $r$.
$$\mathrm{MSE}\{\hat{p}(X)\} = E\{[\hat{p}(X) - p(X)]^2\} . \tag{6.32}$$
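To see why an optimal kernel size can exist, it may help to recall the standard decomposition of a mean-square error into a variance term and a squared-bias term. The identity below is a general fact about any estimator, written in the notation of (6.32); it is added for illustration and is not a numbered equation of the text.

$$E\{[\hat{p}(X) - p(X)]^2\} = \mathrm{Var}\{\hat{p}(X)\} + \left[E\{\hat{p}(X)\} - p(X)\right]^2$$

For a kernel estimate, a smaller $r$ generally reduces the bias but increases the variance, and a larger $r$ does the opposite, so the minimum of the sum is typically attained at an intermediate value of $r$.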
This criterion is a function of $X$, and thus the optimal $r$ also must be a function
of $X$. In order to make the optimal $r$ independent of $X$, we may use the
integral mean-square error