Page 289 - Introduction to Statistical Pattern Recognition
6 Nonparametric Density Estimation 271
(6.81)
Equation (6.80) indicates that $\hat{p}(X) = (k-1)/(Nv)$ is unbiased as long as $u = pv$
holds. If $k/(Nv)$ is used instead, the estimate becomes biased. This is the reason
why $(k-1)$ is used in (6.68) instead of $k$. The variance of $\hat{p}(X)$ can also be
computed under the approximation $u = pv$ as
(6.82)
Comparison of (6.29) and (6.82) shows that the variance of the kNN density
estimate is larger than the one for the Parzen density estimate. Also, (6.82)
indicates that, in the kNN density estimate, k must be selected larger than 2.
Otherwise, a large variance may result.
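
The bias and variance claims above can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes a one-dimensional uniform density on [0, 1] evaluated at the point 0.5, so that $p = 1$ and $u = pv$ holds exactly, with $v = 2r$ for the kNN distance $r$. The function name `knn_density_trials` and all parameter values are hypothetical. The reference variance $p^2 (N-k+1)/(N(k-2))$ used in the comments follows from $u$ being Beta distributed with parameters $(k, N-k+1)$; it is the editor's own calculation and is not meant as a reproduction of (6.82).

```python
import numpy as np


def knn_density_trials(n_samples=500, k=10, n_trials=5000, seed=0):
    """Repeat a 1-D kNN density estimate at x0 = 0.5 for Uniform(0, 1) data.

    Returns two arrays of length n_trials: the estimates (k-1)/(N v)
    and k/(N v), where v = 2 r is the length of the interval reaching
    the k-th nearest neighbor of x0.
    """
    rng = np.random.default_rng(seed)
    x0 = 0.5
    samples = rng.random((n_trials, n_samples))
    dist = np.abs(samples - x0)
    # k-th smallest distance in each trial (index k-1 after partitioning).
    r_k = np.partition(dist, k - 1, axis=1)[:, k - 1]
    v = 2.0 * r_k
    with_k_minus_1 = (k - 1) / (n_samples * v)
    with_k = k / (n_samples * v)
    return with_k_minus_1, with_k


if __name__ == "__main__":
    N, k = 500, 10
    est_km1, est_k = knn_density_trials(n_samples=N, k=k)
    print("mean of (k-1)/(Nv):", est_km1.mean())  # true density is 1
    print("mean of k/(Nv):    ", est_k.mean())    # expected near k/(k-1)
    print("variance of (k-1)/(Nv):", est_km1.var())
    print("reference (N-k+1)/(N(k-2)):", (N - k + 1) / (N * (k - 2)))
```

In this setting the average of $(k-1)/(Nv)$ should sit close to the true density, while the average of $k/(Nv)$ should be inflated by roughly $k/(k-1)$. Rerunning with $k$ close to 2 illustrates how quickly the spread of the estimate grows.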
Second order approximation: When the second order approximation is
needed, (6.79) must be used to relate $u$ and $v$. However, since $r^2$ and $v$ are
related by $v = cr^n$, it is difficult to solve (6.79) for $v$, and a series of
approximations is necessary. Since $\hat{p} = (k-1)/(Nv)$, the computation of the first and second
order moments of $\hat{p}(X)$ requires $E\{v^{-1}\}$ and $E\{v^{-2}\}$. We start by deriving $v^{-1}$
from (6.79) as
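$$
\begin{aligned}
v^{-1} &\cong p\left[u^{-1} + \frac{1}{2}\,\alpha\, c^{-2/n}\, v^{2/n}\, u^{-1}\right] \\
&\cong p\left[u^{-1} + \frac{1}{2}\,\alpha\, (cp)^{-2/n}\, u^{2/n-1}\right],
\end{aligned}
\tag{6.83}
$$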
where the approximation $u = pv$ is applied to the second term to obtain the
second line from the first. Note that the second term was ignored in the first
order approximation and is therefore supposed to be much smaller than the first
term. Thus, using $u = pv$ to approximate the second term is justified. From
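
A small numerical check can make the size of this approximation concrete. The sketch below is illustrative only. It assumes the relation $u = pv\,[1 + \tfrac{1}{2}\alpha r^2]$ with $r^2 = (v/c)^{2/n}$, which is the form of (6.79) implied by the first line of (6.83). The function name and the values chosen for $p$, $\alpha$, $c$, $n$ and $v$ are hypothetical.

```python
import math


def inverse_volume_approximations(v, p, alpha, c, n):
    """Compare 1/v with the two lines of (6.83).

    Assumes u = p v [1 + 0.5 * alpha * r^2] with r^2 = (v / c)^(2 / n),
    the form of (6.79) implied by the first line of (6.83).
    Returns (exact 1/v, first line, second line, first order value p/u).
    """
    r2 = (v / c) ** (2.0 / n)
    u = p * v * (1.0 + 0.5 * alpha * r2)
    # First line: second term still written in terms of v.
    line1 = p * (1.0 / u + 0.5 * alpha * c ** (-2.0 / n) * v ** (2.0 / n) / u)
    # Second line: v inside the second term replaced using u = p v.
    line2 = p * (1.0 / u + 0.5 * alpha * (c * p) ** (-2.0 / n) * u ** (2.0 / n - 1.0))
    first_order = p / u
    return 1.0 / v, line1, line2, first_order


if __name__ == "__main__":
    # Hypothetical values: n = 2 with c = pi, so that v = c r^n is a disc area.
    exact, line1, line2, first_order = inverse_volume_approximations(
        v=0.01, p=1.0, alpha=-0.5, c=math.pi, n=2
    )
    print("exact 1/v:        ", exact)
    print("first line:       ", line1)
    print("second line:      ", line2)
    print("first order (p/u):", first_order)
```

Under the assumed relation the first line reproduces $1/v$ up to rounding, and the second line differs from it only by a term of higher order, which is much smaller than the error of the first order value $p/u$.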
(6.83)