applied to the voting kNN approach too. If we adopt $a_i A$ ($a_1 \neq a_2$) as the metric to measure the distance to $\omega_i$-neighbors in the voting kNN procedure, we can control the decision threshold by adjusting the ratio of $a_1$ and $a_2$. Furthermore, using $a_i A_i$ ($a_1 \neq a_2$, $A_1 \neq A_2$), we could make the voting kNN virtually equivalent to the volumetric kNN. In this case, $A_i$ could be $\Sigma_i$ or a more complex matrix like (7.65), and the ratio of $a_1$ and $a_2$ still determines the threshold.
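As an illustration of such a class-dependent metric, the following sketch implements a voting kNN rule in which distances to $\omega_i$-samples are measured with $a_i A_i$. The function name, the use of NumPy, and the Mahalanobis form $(X-Y)^T (a_i A_i)^{-1} (X-Y)$ are assumptions made for illustration; this is not the book's own code.

```python
import numpy as np

def voting_knn(x, train1, train2, k, a1=1.0, a2=1.0, A1=None, A2=None):
    """Voting kNN with a class-dependent metric a_i * A_i.

    Distances to class-i samples are measured as
    (x - y)^T (a_i A_i)^{-1} (x - y); increasing a1 relative to a2 shrinks
    the class-1 distances, which biases the vote toward omega_1 and thus
    shifts the decision threshold.
    """
    n = x.shape[0]
    A1 = np.eye(n) if A1 is None else A1
    A2 = np.eye(n) if A2 is None else A2

    def sq_dist(samples, a, A):
        diff = samples - x                                  # (N_i, n)
        return np.einsum('ij,jk,ik->i', diff, np.linalg.inv(a * A), diff)

    d1 = sq_dist(train1, a1, A1)                            # to omega_1 samples
    d2 = sq_dist(train2, a2, A2)                            # to omega_2 samples

    # Pool the distances, keep the k smallest, and vote by majority.
    labels = np.concatenate([np.ones(len(d1)), 2 * np.ones(len(d2))])
    nearest = np.argsort(np.concatenate([d1, d2]))[:k]
    return 1 if np.sum(labels[nearest] == 1) > k / 2 else 2
```

With $a_1 = a_2$ and $A_1 = A_2$ this reduces to the ordinary voting kNN; choosing $A_i = \Sigma_i$ (or a matrix like (7.65)) and adjusting $a_1/a_2$ gives the behavior described above.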
Data display: Equation (7.5) also suggests that we may plot data using $y_1 = n\ln d_1(X)$ and $y_2 = n\ln d_2(X)$ as the x- and y-axes, in the same way as we selected $y_i = (X - M_i)^T \Sigma_i^{-1}(X - M_i)$ for the parametric case in Fig. 4-9 [16]. In fact, $n\ln d_i(X)$ is the nonparametric version of the normalized distance $(X - M_i)^T \Sigma_i^{-1}(X - M_i)$, as the following comparison shows:
$$-\ln p_i(X) = \frac{1}{2}(X - M_i)^T \Sigma_i^{-1}(X - M_i) + \left\{ \frac{1}{2}\ln|\Sigma_i| + \frac{n}{2}\ln 2\pi \right\}$$
for a normal distribution , (7.76)

$$-\ln \hat{p}_i(X) = n\ln d_i(X) + \left\{ \ln\frac{N_i c_0 |\Sigma_i|^{1/2}}{k_i - 1} \right\}$$
for the kNN density estimate , (7.77)
where $c_0$ is the constant relating $v_i$ and $d_i$ as in (B.1). Note that $\hat{p}_i(X) = (k_i - 1)/\{N_i c_0 |\Sigma_i|^{1/2} d_i^n(X)\}$ is used in (7.77). Using two new variables,
$y_1 = n\ln d_1(X)$ and $y_2 = n\ln d_2(X)$, the Bayes classifier becomes a 45° line as

$$n\ln d_2(X) \underset{\omega_2}{\overset{\omega_1}{\gtrless}} n\ln d_1(X) + \left\{ \ln\frac{N_1 |\Sigma_1|^{1/2} (k_2 - 1)}{N_2 |\Sigma_2|^{1/2} (k_1 - 1)} \right\} \qquad (7.78)$$

where $\{\cdot\}$ gives the y-cross point.
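A minimal sketch of the display coordinates and the decision rule (7.78) follows; it assumes that $d_i(X)$ is the distance from $X$ to its $k_i$-th nearest $\omega_i$-neighbor measured with the metric $\Sigma_i$, and the function and variable names are illustrative rather than taken from the book.

```python
import numpy as np

def y_coordinate(x, train, Sigma, k):
    """y = n * ln d(X), where d(X) is the distance from X to its k-th
    nearest neighbor in `train`, measured with the metric Sigma."""
    n = x.shape[0]
    diff = train - x
    d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(Sigma), diff)
    d_k = np.sqrt(np.sort(d2)[k - 1])          # k-th nearest-neighbor distance
    return n * np.log(d_k)

def classify_45deg(x, train1, train2, Sigma1, Sigma2, k1, k2):
    """Decision rule of (7.78): choose omega_1 when y2 exceeds y1 plus the
    y-cross point; note that the constant c0 cancels out of the threshold."""
    y1 = y_coordinate(x, train1, Sigma1, k1)
    y2 = y_coordinate(x, train2, Sigma2, k2)
    N1, N2 = len(train1), len(train2)
    t = np.log((N1 * np.sqrt(np.linalg.det(Sigma1)) * (k2 - 1))
               / (N2 * np.sqrt(np.linalg.det(Sigma2)) * (k1 - 1)))
    return 1 if y2 > y1 + t else 2
```

For the leave-one-out (L) method used in the experiment below, the test sample must be removed from its own class's design set before $d_i(X)$ is computed.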
In order to show what the display of data looks like, the following experiment was conducted (a stand-in sketch of the display computation is given after the experiment specification).
Experiment 17: Display of data
Data: I-Λ (Normal, n = 8, $\varepsilon^*$ = 1.9%)
Sample size: $N_1 = N_2 = 100$ (L method)
No. of neighbors: $k_1 = k_2 = 5$
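A stand-in driver for this kind of display might look as follows. The actual Data I-Λ means and covariances are specified elsewhere in the book and are not reproduced here, so two generic 8-dimensional Gaussian classes (an assumption) are used instead; the plotting calls and names are likewise illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n, N, k = 8, 100, 5

# Stand-in classes; the true Data I-Lambda parameters are not reproduced here.
X1 = rng.standard_normal((N, n))                    # omega_1 ~ N(0, I)
X2 = 1.5 * rng.standard_normal((N, n)) + 1.0        # omega_2: shifted and scaled

def y_coord(x, train, Sigma, k):
    diff = train - x
    d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(Sigma), diff)
    d2 = np.sort(d2[d2 > 0])                        # L method: drop the self-distance
    return n * np.log(np.sqrt(d2[k - 1]))

S1 = np.cov(X1, rowvar=False)
S2 = np.cov(X2, rowvar=False)
pts = np.array([[y_coord(x, X1, S1, k), y_coord(x, X2, S2, k)]
                for x in np.vstack([X1, X2])])

# 45-degree decision line of (7.78); with N1 = N2 and k1 = k2 the
# y-cross point reduces to ln(|S1|^(1/2) / |S2|^(1/2)).
t = 0.5 * np.log(np.linalg.det(S1) / np.linalg.det(S2))
plt.scatter(pts[:N, 0], pts[:N, 1], marker='o', label='omega_1')
plt.scatter(pts[N:, 0], pts[N:, 1], marker='x', label='omega_2')
span = np.array([pts[:, 0].min(), pts[:, 0].max()])
plt.plot(span, span + t, 'k-')
plt.xlabel('y1 = n ln d1(X)')
plt.ylabel('y2 = n ln d2(X)')
plt.legend()
plt.show()
```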

