$$
E\{\Delta\hat{\varepsilon}\} \cong \frac{b_1}{k} + b_2 \left(\frac{k}{N}\right)^{2/n} + b_3 \left(\frac{k}{N}\right)^{4/n} , \tag{7.74}
$$

where the $b_i$'s are constants.
Several conclusions can be drawn from (7.74). In order to make
$E\{\Delta\hat{\varepsilon}\} \to 0$, two conditions, $k \to \infty$ and $k/N \to 0$ as $N \to \infty$, must be
satisfied. These conditions are identical to the conditions required for asymptotic
consistency of the kNN density estimate [15]. Since $k/N \cong p_i v_i$, $k/N$ is
proportional to $r^n$, and therefore $(k/N)^{2/n}$ is proportional to $r^2$. That is,
$E\{\Delta\hat{\varepsilon}\} \cong b_1/k + b_2' r^2 + b_3' r^4$, where $b_2'$ and $b_3'$ are constants different from $b_2$
and $b_3$. We may compare this with (7.52), $E\{\Delta\hat{\varepsilon}\}$ of the Parzen approach. In
both cases, we have terms with $r^2$ and $r^4$ which are generated by the bias of
the density estimate. However, the bias of the kNN error does not have an
$r^{-n}/N$ term. This is because (7.74) describes the behavior of the
kNN error only for large values of $k$: a series of approximations based on
$N \gg k \gg 1$ was applied to obtain $E\{\hat{\varepsilon}(X)\}$ and $\mathrm{MSE}\{\hat{\varepsilon}(X)\}$. If the analysis of
the kNN error for small values of $k$ is needed, more complicated expressions
must be used. In addition to the $r^2$ and $r^4$ terms, the kNN bias has a constant
term $b_1/k$, which the Parzen bias does not have. This may be related to the
fact that the voting kNN error with a finite $k$ does not converge to the Bayes
error, even under asymptotic conditions. This term can be reduced only by
increasing $k$.
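To make the role of the two consistency conditions concrete, the following minimal numerical sketch (not from the text) estimates the voting kNN error for two spherical Gaussian classes, for which the Bayes error is known in closed form. The class separation, design sizes, and all variable names are illustrative assumptions.

```python
# Hedged sketch: the voting kNN error approaches the Bayes error only as
# k grows and k/N shrinks. Two unit-covariance Gaussian classes separated
# by Mahalanobis distance delta are assumed, so the Bayes error is
# Phi(-delta/2) in closed form.
import numpy as np
from scipy.stats import norm
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
n_dim, delta = 2, 2.0
bayes_error = norm.cdf(-delta / 2)  # Phi(-delta/2) for equal covariances

def knn_error(N, k, trials=10):
    """Average test error of a voting kNN rule, N design samples per class."""
    errs = []
    for _ in range(trials):
        X0 = rng.normal(size=(N, n_dim))                     # class 0 design set
        X1 = rng.normal(size=(N, n_dim)); X1[:, 0] += delta  # class 1 design set
        X = np.vstack([X0, X1])
        y = np.r_[np.zeros(N), np.ones(N)]
        T0 = rng.normal(size=(500, n_dim))                   # independent test set
        T1 = rng.normal(size=(500, n_dim)); T1[:, 0] += delta
        T = np.vstack([T0, T1])
        t = np.r_[np.zeros(500), np.ones(500)]
        # Voting kNN decision: majority label among the k nearest neighbors.
        nn = np.argsort(cdist(T, X), axis=1)[:, :k]
        votes = y[nn].mean(axis=1) > 0.5
        errs.append((votes != t).mean())
    return float(np.mean(errs))

for N, k in [(50, 1), (50, 5), (500, 5), (500, 25), (5000, 25)]:
    print(f"N={N:5d}  k={k:3d}  k/N={k / N:.3f}  "
          f"error={knn_error(N, k):.3f}  Bayes={bayes_error:.3f}")
```

As $k$ grows while $k/N$ is kept small, the estimated error should drift toward the Bayes value, mirroring the $b_1/k$ and $(k/N)^{2/n}$ terms in (7.74).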
Effect of parameters: In the Parzen approach, the most effective way to
reduce the bias of the error estimate is to adjust the threshold properly. Also,
in order to assure that the L method gives an upper bound of the Bayes error,
the kernel covariance matrix must be estimated either from a large number of
independent samples or by an L type estimation technique.
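Since the L method figures in this comparison, a brief sketch may help: for the voting kNN rule, the leave-one-out (L) estimate is obtained simply by excluding each sample from its own neighbor list, while resubstitution (R) lets a sample vote for itself. The setup below is an illustrative assumption, not the book's experiment.

```python
# Hedged sketch: R (resubstitution) and L (leave-one-out) error estimates
# for a voting kNN classifier on an assumed two-Gaussian design set.
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(2)
N, k = 100, 5
X = np.vstack([rng.normal(size=(N, 2)),
               rng.normal(size=(N, 2)) + [2.0, 0.0]])  # two assumed classes
y = np.r_[np.zeros(N), np.ones(N)]

D = cdist(X, X)                  # pairwise distances within the design set
order = np.argsort(D, axis=1)    # column 0 of each row is the sample itself

# R estimate: each sample is among its own neighbors (distance zero),
# which biases the error estimate downward.
r_votes = y[order[:, :k]].mean(axis=1) > 0.5
# L estimate: skip column 0 (the sample itself) before taking k neighbors.
l_votes = y[order[:, 1:k + 1]].mean(axis=1) > 0.5

print(f"R (resubstitution) error: {(r_votes != y).mean():.3f}")
print(f"L (leave-one-out)  error: {(l_votes != y).mean():.3f}")
```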
In terms of threshold selection, the method developed for the Parzen
approach may be directly applied to the kNN approach. The kNN density esti-
mates are known to be biased when the size of the design set is limited, and,
by choosing an appropriate threshold, one might hope to reduce or eliminate
the effect of that bias when classification is performed. There are no usable
expressions for $t$ even in the normal case. However, each of the non-normal
methods for threshold selection (Options 2, 3, and 4) is directly applicable to
the kNN problem, as the sketch below suggests.
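As one illustration of what such an empirical threshold scan might look like (an assumed setup, not one of the book's Options 2-4), the sketch below forms a kNN log density-ratio statistic $h(X)$ and scans a threshold $t$ over independent test samples; the data model, $k$, and sample sizes are all hypothetical.

```python
# Hedged sketch of threshold adjustment for the kNN decision rule. The
# class densities are estimated by p_hat_i(X) ~ k / (N_i * v_i(X)), where
# v_i is the volume of the sphere reaching the k-th nearest neighbor in
# class i; classification compares h(X) = -ln(p_hat_1/p_hat_2) against a
# threshold t, and t is scanned empirically over test data.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
n_dim, N, k = 2, 200, 5
X1 = rng.normal(size=(N, n_dim))                 # class 1 design set
X2 = rng.normal(size=(N, n_dim)) + [2.0, 0.0]    # class 2 design set
tree1, tree2 = cKDTree(X1), cKDTree(X2)

def h(T):
    """kNN log density-ratio statistic; larger values favor class 2."""
    d1, _ = tree1.query(T, k)   # distances to the k nearest in class 1
    d2, _ = tree2.query(T, k)
    # p_hat_i is proportional to 1 / d_k^n, so for equal design sizes the
    # statistic reduces to n * (ln d1_k - ln d2_k).
    return n_dim * (np.log(d1[:, -1]) - np.log(d2[:, -1]))

# Independent test samples used to scan the threshold.
T1 = rng.normal(size=(1000, n_dim))
T2 = rng.normal(size=(1000, n_dim)) + [2.0, 0.0]
for t in np.linspace(-1.0, 1.0, 9):
    err = 0.5 * ((h(T1) > t).mean() + (h(T2) <= t).mean())
    print(f"t={t:+.2f}  estimated error={err:.3f}")
```

Scanning $t$ in this fashion and keeping the minimizing value plays the same role here that threshold adjustment plays in the Parzen approach: it compensates for the bias of the finite-sample density estimates at the decision boundary.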

