where N >> k >> 1 is assumed. Therefore, the variance and mean-square error
of p̂(X) are
Var{p̂(X)} ≅ p²(X)/k ,  (6.93)

MSE{p̂(X)} = E{(p̂(X) - p(X))²} ≅ p²(X)/k + (E{p̂(X)} - p(X))² .  (6.94)
Again, in (6.94) the first and second terms are the variance and the squared
bias respectively. It must be pointed out that the series of approximations used
to obtain (6.91)-(6.94) is valid only for large k. For small k, different and more
complex approximations for E{p̂(X)} and Var{p̂(X)} must be derived by using
(6.87) and (6.88) rather than (6.90). As in the Parzen case, the second order
approximation for the bias and the first order approximation for the variance
may be used for simplicity. Also, note that the MSE of (6.94) becomes zero as
k → ∞ and k/N → 0. These are the conditions for the kNN density estimate to be
asymptotically unbiased and consistent [14].
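As a concrete illustration of the N >> k >> 1 behavior, the following Python sketch (not from the text; it assumes the common form p̂(X) = k/(N·v_k(X)) of the one-dimensional kNN estimate, whereas (6.68) may carry k-1 in the numerator, and the helper name knn_density_estimate is hypothetical) checks by Monte Carlo that the relative standard deviation of p̂(X) shrinks roughly like 1/√k, in line with Var{p̂(X)} ≅ p²(X)/k.

    import numpy as np

    def knn_density_estimate(x, samples, k):
        """1-D kNN density estimate: p_hat(x) = k / (N * v_k(x)), where
        v_k(x) = 2 * r_k is the length of the smallest interval centered
        at x containing the k nearest samples (illustrative form only)."""
        dists = np.sort(np.abs(samples - x))
        r_k = dists[k - 1]                 # distance to the k-th nearest neighbor
        return k / (len(samples) * 2.0 * r_k)

    # Monte Carlo check of the large-k, small-k/N regime: the relative
    # standard deviation of p_hat(X) should fall roughly like 1/sqrt(k).
    rng = np.random.default_rng(0)
    x0, p_true = 0.0, 1.0 / np.sqrt(2.0 * np.pi)   # N(0,1) density at x = 0
    N, trials = 20000, 200
    for k in (10, 40, 160):
        est = np.array([knn_density_estimate(x0, rng.standard_normal(N), k)
                        for _ in range(trials)])
        print(f"k={k:4d}  mean={est.mean():.4f}  "
              f"rel. std={est.std() / p_true:.3f}  1/sqrt(k)={1.0 / np.sqrt(k):.3f}")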
Optimal Number of Neighbors
Optimal k: In order to apply the kNN density estimate of (6.68), we
need to know what value to select for k. The optimal k under the approximation
of u ≅ pv is ∞, obtained by minimizing (6.82) with respect to k. That is, when
L(X) is small and u ≅ pv holds, the variance dominates the MSE and can be
reduced by selecting a larger k or a larger L(X). As L(X) becomes larger, the
second order term produces the bias and the bias increases with L(X). The
optimal k is therefore determined by the rate at which the variance decreases and
the rate at which the bias increases.
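This trade-off can also be seen numerically. Under the same hypothetical 1-D Gaussian setup and estimator as in the sketch above (again an illustration, not the text's derivation), the variance term of the empirical MSE at a point of nonzero curvature falls as k grows, while the squared bias grows once the k-neighbor region L(X) becomes wide, so the smallest MSE occurs at an intermediate k.

    import numpy as np

    def knn_density_estimate(x, samples, k):
        # 1-D kNN density estimate, p_hat(x) = k / (N * 2 * r_k)  (illustrative form)
        dists = np.sort(np.abs(samples - x))
        return k / (len(samples) * 2.0 * dists[k - 1])

    # Empirical bias/variance split of the MSE at the peak of N(0,1),
    # where the curvature of p makes the bias grow with L(X).
    rng = np.random.default_rng(1)
    x0, p_true = 0.0, 1.0 / np.sqrt(2.0 * np.pi)
    N, trials = 2000, 500
    for k in (5, 20, 80, 320, 1000):
        est = np.array([knn_density_estimate(x0, rng.standard_normal(N), k)
                        for _ in range(trials)])
        var, bias2 = est.var(), (est.mean() - p_true) ** 2
        print(f"k={k:5d}  var={var:.2e}  bias^2={bias2:.2e}  MSE={var + bias2:.2e}")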