Page 321 - Introduction to Statistical Pattern Recognition
7 Nonparametric Classification and Error Estimation 303
kNN Approach
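Classifier: Using the kNN density estimate of Chapter 6, the likelihood
ratio classifier becomes

$$
-\ln \frac{\hat{p}_1(X)}{\hat{p}_2(X)}
= -n \ln \frac{d_2(X_{k_2 NN}^{(2)}, X)}{d_1(X_{k_1 NN}^{(1)}, X)}
- \ln \frac{(k_1 - 1)\, N_2\, |\Sigma_2|^{1/2}}{(k_2 - 1)\, N_1\, |\Sigma_1|^{1/2}}
\;\underset{\omega_2}{\overset{\omega_1}{\lessgtr}}\; t ,
\tag{7.5}
$$
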
where $v_i = \pi^{n/2}\,\Gamma^{-1}(n/2+1)\,|\Sigma_i|^{1/2}\,d_i^n$ from (B.1), and
$d_i^2(Y,X) = (Y-X)^T \Sigma_i^{-1} (Y-X)$. In order to classify a test sample $X$,
the $k_1$th NN from $\omega_1$ and the $k_2$th NN from $\omega_2$ are found, the
distances from $X$ to these neighbors are measured, and these distances are
inserted into (7.5) to test whether the left-hand side is smaller or larger
than $t$. In order to avoid unnecessary complexity, $k_1 = k_2$ is assumed in
this chapter.
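The classification step described above can be sketched in Python with NumPy. This is a minimal illustration, not the book's implementation: the function names (`mahalanobis`, `knn_discriminant`, `classify`) and the argument layout are assumptions made here, the covariance matrices are taken as given, and $k_1 = k_2 = k$ is assumed so that the $(k_i - 1)$ factors in (7.5) cancel.

```python
import numpy as np

def mahalanobis(X, samples, cov):
    """d(Y, X) = sqrt((Y - X)^T cov^{-1} (Y - X)) for every row Y of samples."""
    diff = samples - X
    solved = np.linalg.solve(cov, diff.T).T      # rows are cov^{-1} (Y - X)
    return np.sqrt(np.einsum("ij,ij->i", diff, solved))

def knn_discriminant(X, S1, S2, cov1, cov2, k):
    """Left-hand side of (7.5), assuming k1 = k2 = k (hypothetical helper)."""
    n = X.shape[0]
    N1, N2 = len(S1), len(S2)
    d1 = np.sort(mahalanobis(X, S1, cov1))[k - 1]   # distance to k-th NN from class 1
    d2 = np.sort(mahalanobis(X, S2, cov2))[k - 1]   # distance to k-th NN from class 2
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    # -n ln(d2/d1) - ln( N2 |cov2|^(1/2) / (N1 |cov1|^(1/2)) )
    return -n * np.log(d2 / d1) - (np.log(N2 / N1) + 0.5 * (logdet2 - logdet1))

def classify(X, S1, S2, cov1, cov2, k, t=0.0):
    """Return 1 if the left-hand side of (7.5) is smaller than t, else 2."""
    return 1 if knn_discriminant(X, S1, S2, cov1, cov2, k) < t else 2
```

The sketch does not guard against a zero distance (a test sample that coincides with its kth neighbor), which would need separate handling in practice.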
Error estimation: The classification error based on a given data set S
can be estimated by using the L and R methods. When $X_k^{(1)}$ from $\omega_1$ is
tested by the R method, $X_k^{(1)}$ must be included as a member of the design set.
Therefore, when the kNN’s of $X_k^{(1)}$ are found from the $\omega_1$ design set,
$X_k^{(1)}$ itself is included among these kNN’s. Figure 7-1 shows how the kNN’s
are selected and how the distances to the kth NN’s are measured for k = 2. Note
in Fig. 7-1 that the locus of points equidistant from $X_k^{(1)}$ becomes
ellipsoidal because the distance is normalized by $\Sigma_i$. Also, since
$\Sigma_1 \neq \Sigma_2$ in general, two different ellipsoids are used for
$\omega_1$ and $\omega_2$. In the R method, $X_k^{(1)}$ and $X_{NN}^{(1)}$ are the
nearest and second nearest neighbors of $X_k^{(1)}$ from $\omega_1$, while
$X_{NN}^{(2)}$ and $X_{2NN}^{(2)}$ are the nearest and second nearest neighbors of
$X_k^{(1)}$ from $\omega_2$. Thus,
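As a worked substitution, and assuming (7.5) is applied unchanged with $k_1 = k_2 = 2$ so that the $(k_i - 1)$ factors cancel, the R-method test for $X_k^{(1)}$ would take the following form. The display is a sketch derived from (7.5) and the neighbor labels above, under that assumption.

```latex
-n \ln \frac{d_2\big(X_{2NN}^{(2)}, X_k^{(1)}\big)}{d_1\big(X_{NN}^{(1)}, X_k^{(1)}\big)}
- \ln \frac{N_2\, |\Sigma_2|^{1/2}}{N_1\, |\Sigma_1|^{1/2}}
\;\underset{\omega_2}{\overset{\omega_1}{\lessgtr}}\; t
```

Since $X_k^{(1)}$ belongs to $\omega_1$, it would be counted as an error when the left-hand side exceeds $t$.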
On the other hand, in the L method, $X_k^{(1)}$ is no longer considered a
member of the design set. Therefore, $X_{NN}^{(1)}$ and $X_{2NN}^{(1)}$ are selected
as the nearest and second nearest neighbors of $X_k^{(1)}$ from $\omega_1$. The
selection of $\omega_2$ neighbors is the same as before. Thus,
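For $k = 2$, the $\omega_1$ distance inserted into (7.5) is therefore $d_1(X_{2NN}^{(1)}, X_k^{(1)})$ under the L method, compared with $d_1(X_{NN}^{(1)}, X_k^{(1)})$ under the R method. A minimal Python sketch of the two selection rules follows. The helper names `kth_nn_distance` and `omega1_distance`, and the `method` argument, are hypothetical and introduced only for illustration.

```python
import numpy as np

def kth_nn_distance(x, design, cov, k):
    """Distance sqrt((Y - x)^T cov^{-1} (Y - x)) from x to its k-th nearest row Y of design."""
    diff = design - x
    sq = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    return np.sqrt(np.sort(sq)[k - 1])

def omega1_distance(S1, idx, cov1, k, method):
    """d_1 used when the class-1 sample S1[idx] is tested (hypothetical helper)."""
    x = S1[idx]
    if method == "R":
        # R method: x stays in the design set, so it is its own nearest
        # neighbor (distance 0) and is counted among the k neighbors.
        design = S1
    elif method == "L":
        # L method: x is removed from the design set before the search.
        design = np.delete(S1, idx, axis=0)
    else:
        raise ValueError("method must be 'R' or 'L'")
    return kth_nn_distance(x, design, cov1, k)
```

In both methods the $\omega_2$ distance would be obtained the same way, for example `kth_nn_distance(S1[idx], S2, cov2, k)`, since the tested sample is never a member of the $\omega_2$ design set.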