Page 273 - Introduction to Statistical Pattern Recognition

P. 273

6 Nonparametric Density Estimation 255

chapter is to make the reader familiar with the fundamental mathematical pro-
perties related to nonparametric density estimation in preparation for the
material presented in Chapter 7.

6.1 Parzen Density Estimate
Parzen Density Estimate

In order to estimate the value of a density function at a point X, we may
set up a small local region around X, L (X). Then, the probability coverage (or
probability mass) of L(X) may be approximated by p(X)v where v is the
volume of L(X). This probability may be estimated by drawing a large
number of samples, N, from p(X), counting the number of samples, k, falling
in L (X), and computing k/N. Equating these two probabilities, we may obtain
an estimate of the density function as
n k(X) n k(X)
or p(X)= -
p(X)v = - .
N Nv
Note that, with a fixed v, k is a random variable and is dependent on X. A
fixed v does not imply the same v throughout the entire space, and vv could still
vary with X. However, v is a preset value and is not a random variable.

Kernel expression: The estimate of (6.1) has another interpretation.
Suppose that 3 samples, X3, X,, and X,, are found in L(X) as shown in Fig.
6-1. With I' and N given, i(X) becomes 3/Nv. On the other hand, if we set up
a uniform kernel function, IC(.), with volume v and height l/v around all exist-
ing samples, the average of the values of these kernel functions at X is also
3/Nv. That is, [ 1-41

n l N
p(x) = - K(X - x;) . (6.2)
N ;=I
As seen in Fig. 6-1, only the kernel functions around the 3 samples,
X3, X4, and X,, contribute to the summation of (6.2).
Once (6.2) is adopted, the shape of the kernel function could be selected
more freely, under the condition K(X) dX = 1. For one-dimensional cases,
we may seek optimality and select a complex shape. However, in a high-
dimensional space, because of its complexity, the practical selection of the ker-

268 269 270 271 272 273 274 275 276 277 278