Page 291 - Computational Statistics Handbook with MATLAB


constructed using shifts along the coordinates given by multiples of
$\delta_i / m_i$, $i = 1, \ldots, d$. Scott [1992] provides a detailed
algorithm for the bivariate ASH.

8.3 Kernel Density Estimation
Scott [1992] shows that as the number of histograms m approaches infinity,
the ASH becomes a kernel estimate of the probability density function. The
first published paper describing nonparametric probability density
estimation was by Rosenblatt [1956], where he described the general kernel
estimator. Many papers that expanded the theory followed soon after. A
partial list includes Parzen [1962], Cencov [1962] and Cacoullos [1966].
Several references providing surveys and summaries of nonparametric density
estimation are provided in Section 8.7. The following treatment of kernel
density estimation follows that of Silverman [1986] and Scott [1992].

Univariate Kernel Estimators
The kernel estimator is given by
$$\hat{f}_{Ker}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right), \qquad (8.26)$$
where the function $K(t)$ is called a kernel. This must satisfy the condition that
$\int K(t)\,dt = 1$ to ensure that our estimate in Equation 8.26 is a bona fide density
estimate. If we define $K_h(t) = K(t/h)/h$, then we can also write the kernel
estimate as
$$\hat{f}_{Ker}(x) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - X_i). \qquad (8.27)$$
Usually, the kernel is a symmetric probability density function, and often a
standard normal density is used. However, this does not have to be the case,
and we will present other choices later in this chapter. From the definition of
a kernel density estimate, we see that our estimate $\hat{f}_{Ker}(x)$ inherits all of the
properties of the kernel function, such as continuity and differentiability.
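
As a rough illustration of Equations 8.26 and 8.27, the following sketch evaluates both forms with a standard normal kernel and shows that they agree. It is written in Python using only the standard library; the function names, the sample values, and the choice h = 0.5 are assumptions made for this example and do not come from the text.

```python
import math

def normal_kernel(t):
    """Standard normal density, used here as the kernel K(t)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def kde_826(x, data, h):
    """Equation 8.26: (1 / (n h)) * sum over i of K((x - X_i) / h)."""
    n = len(data)
    return sum(normal_kernel((x - xi) / h) for xi in data) / (n * h)

def scaled_kernel(t, h):
    """K_h(t) = K(t / h) / h."""
    return normal_kernel(t / h) / h

def kde_827(x, data, h):
    """Equation 8.27: (1 / n) * sum over i of K_h(x - X_i)."""
    n = len(data)
    return sum(scaled_kernel(x - xi, h) for xi in data) / n

# Hypothetical sample and smoothing parameter, chosen only for illustration.
data = [-1.2, -0.4, 0.1, 0.3, 1.5]
h = 0.5

for x in (-1.0, 0.0, 1.0):
    # The two columns should match up to floating-point rounding.
    print(x, kde_826(x, data, h), kde_827(x, data, h))
```

The two functions differ only in where the division by h takes place, which is the point of the definition of $K_h$.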
From Equation 8.26, the estimated probability density function is obtained
by placing a weighted kernel function centered at each data point and then
summing these weighted functions, which is the same as averaging the scaled
kernels $K_h(x - X_i)$. See Figure 8.8 for an illustration of this procedure.
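
The procedure just described can be sketched numerically: one weighted kernel is centered at each data point, the weighted kernels are summed, and the result is checked to integrate to approximately one. This is a minimal Python sketch under stated assumptions: the standard normal kernel, the sample, the value h = 0.5, the integration limits, and all function names are hypothetical choices for illustration.

```python
import math

def normal_kernel(t):
    """Standard normal density, used here as the kernel K(t)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def bumps(x, data, h):
    """Weighted kernels K((x - X_i) / h) / (n h), one per data point."""
    n = len(data)
    return [normal_kernel((x - xi) / h) / (n * h) for xi in data]

def kde(x, data, h):
    """Kernel estimate at x: the sum of the weighted kernels."""
    return sum(bumps(x, data, h))

def integrate(f, a, b, steps=4000):
    """Composite trapezoidal rule on [a, b]."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * width)
    return total * width

# Hypothetical sample and smoothing parameter, chosen only for illustration.
data = [-1.2, -0.4, 0.1, 0.3, 1.5]
h = 0.5

# Each weighted kernel carries mass of about 1/n, so the sum carries about 1.
first_bump_area = integrate(lambda x: bumps(x, data, h)[0], -10.0, 10.0)
area = integrate(lambda x: kde(x, data, h), -10.0, 10.0)
print(first_bump_area, area)
```

The limits of plus and minus 10 are wide enough here because the normal kernels with h = 0.5 are negligible that far from the sample; a different sample or kernel would need limits chosen to match.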
© 2002 by Chapman & Hall/CRC
