to these neighbors are equal and the pairwise errors are added without mutual
interaction. For large $\sigma$, the pairwise errors tend to saturate at 50% while the
overall error saturates at 100%. Thus, the above empirical equation does not hold.
When one class is surrounded by many other classes, we may design a
circular, one-class classifier. That is, X is classified to $\omega_i$ if $d(X, M_i) <
d(M_i, M_{NN})/2$ [see Fig. 6-5]. Then, the error from $\omega_i$, $\varepsilon_c$, is
$$\varepsilon_c = \int_{d(M_i, M_{NN})/2\sigma}^{\infty} \frac{n\,\ell^{\,n-1}}{2^{n/2}\,\Gamma\!\left(\frac{n+2}{2}\right)}\, e^{-\ell^2/2}\, d\ell \quad \text{(circular error)} \;, \qquad (6.121)$$
where the integrand is the marginal density function of the distance from the
center and is derived from $N_X(0, I)$. Note that the density function of the
squared distance, $\xi$, is given in (3.59) for $N_X(0, I)$. Therefore, the integrand of
(6.121) may be obtained from (3.59) by applying the transformation $\ell = \sqrt{\xi}$. The
$\varepsilon_c$ computed from (6.121) is plotted (dotted lines) in Fig. 6-6. As is seen in
Figs. 6-5 and 6-6, the circular classifier is worse than the pairwise bisector
classifier.
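Since the integrand of (6.121) is the chi density with $n$ degrees of freedom, $\varepsilon_c$ is the chi survival function evaluated at the normalized radius $d(M_i, M_{NN})/2\sigma$. A minimal numerical sketch, assuming scipy is available and using a hypothetical helper name circular_error:

```python
from scipy.stats import chi

def circular_error(d, n, sigma=1.0):
    """Circular error (6.121): the probability that a sample of the
    surrounded class falls outside the sphere of radius d/2 about its
    own mean, for classes modeled as N(M_i, sigma**2 I)."""
    # The distance from the mean, normalized by sigma, is chi-distributed
    # with n degrees of freedom, so (6.121) is its survival function.
    return chi(n).sf(d / (2.0 * sigma))

# e.g., dimensionality n = 8 and a neighbor distance d(M_i, M_NN) = 6
print(circular_error(d=6.0, n=8))
```

For a fixed neighbor distance, the error grows with the dimensionality $n$, since the mass of the chi distribution moves outward as $n$ increases.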
6.3 Expansion by Basis Functions
Expansion of Density Functions
Basis functions: Another approach to approximating a density function
is to find an expansion in a set of basis functions $\psi_i(X)$ as

$$p(X) = \sum_i c_i \psi_i(X) \;. \qquad (6.122)$$
If the basis functions satisfy

$$\int K(X)\, \psi_i(X)\, \psi_j^*(X)\, dX = \lambda_j \delta_{ij} \;, \qquad (6.123)$$

we say that the $\psi_i(X)$'s are orthogonal with respect to the kernel $K(X)$. The
term $\psi_j^*(X)$ is the complex conjugate of $\psi_j(X)$, and equals $\psi_j(X)$ when $\psi_j(X)$ is a
real function. If the basis functions are orthogonal with respect to $K(X)$,
multiplying (6.122) by $K(X)\psi_i^*(X)$ and integrating shows that the
coefficients of (6.122) are computed by

$$\lambda_i c_i = \int K(X)\, p(X)\, \psi_i^*(X)\, dX \;. \qquad (6.124)$$
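As a concrete instance of (6.122)-(6.124) (an assumed example, not from the text): the physicists' Hermite polynomials $H_i(x)$ are orthogonal with respect to the kernel $K(x) = e^{-x^2}$ with $\lambda_i = \sqrt{\pi}\, 2^i\, i!$. The sketch below computes the coefficients by (6.124) with numerical integration and evaluates the expansion (6.122) for a standard normal density; target_density and coefficient are hypothetical helper names.

```python
import math
import numpy as np
from scipy.integrate import quad
from numpy.polynomial.hermite import hermval

def target_density(x):
    # density p(x) to be expanded: standard normal N(0, 1)
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def coefficient(i):
    # (6.124): lambda_i c_i = integral K(x) p(x) H_i(x) dx,
    # with K(x) = exp(-x^2) and lambda_i = sqrt(pi) 2^i i!
    lam = math.sqrt(math.pi) * 2.0**i * math.factorial(i)
    basis = [0.0] * i + [1.0]        # coefficient vector selecting H_i
    integrand = lambda x: math.exp(-x * x) * target_density(x) * hermval(x, basis)
    val, _err = quad(integrand, -np.inf, np.inf)
    return val / lam

m = 10                               # number of expansion terms retained
c = [coefficient(i) for i in range(m)]

# (6.122): p_hat(x) = sum_i c_i H_i(x)
for x in (0.0, 0.5, 1.0, 2.0):
    print(f"x={x:4.1f}  p={target_density(x):.5f}  p_hat={hermval(x, c):.5f}")
```

Because the orthogonality in (6.123) is weighted by $K(x)$, the truncated expansion matches $p(x)$ best where $K(x)$ is large, i.e., near the origin.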
When K(X) is a density function, (6.123) and (6.124) may be expressed by