
7  Nonparametric Classification and Error Estimation



A similar trend was observed in the parametric cases of Chapter 5, but the trend is more severe with a nonparametric approach. These problems are addressed extensively in this chapter.

Both the Parzen and kNN approaches will be discussed. The two approaches offer similar algorithms for classification and error estimation, and give similar results. The voting kNN procedure is also included in this chapter because it is very popular, although it differs slightly from the kNN density estimation approach.



                     7.1  General Discussion


                     Parzen Approach

Classifier: As we discussed in Chapter 3, the likelihood ratio classifier is given by $-\ln p_1(X)/p_2(X) \gtrless t$, where the threshold $t$ is determined in various ways depending on the type of classifier to be designed (e.g., Bayes, Neyman-Pearson, minimax, etc.). In this chapter, the true density functions are replaced by their estimates discussed in Chapter 6. When the Parzen density estimate with a kernel function $\kappa_i(\cdot)$ is used, the likelihood ratio classifier becomes

$$
-\ln \frac{\hat{p}_1(X)}{\hat{p}_2(X)}
= -\ln \frac{\frac{1}{N_1} \sum_{j=1}^{N_1} \kappa_1(X - X_j^{(1)})}
            {\frac{1}{N_2} \sum_{j=1}^{N_2} \kappa_2(X - X_j^{(2)})}
\;\underset{\omega_2}{\overset{\omega_1}{\gtrless}}\; t \,,
\tag{7.1}
$$

where $S = \{X_1^{(1)}, \ldots, X_{N_1}^{(1)}, X_1^{(2)}, \ldots, X_{N_2}^{(2)}\}$ is the given data set. Equation (7.1) classifies a test sample $X$ into either $\omega_1$ or $\omega_2$, depending on whether the left-hand side is smaller or larger than the threshold $t$.
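
To make Eq. (7.1) concrete, here is a minimal sketch (not from the text) of a Parzen likelihood ratio classifier in Python. The isotropic Gaussian kernels, the bandwidths h1 and h2, the zero default threshold, and all function names are illustrative assumptions; the text's discussion of kernel and bandwidth choice is in Chapter 6.

```python
import numpy as np

def gaussian_kernel(u, h):
    """Isotropic Gaussian kernel kappa(u) with bandwidth h in d dimensions."""
    d = u.shape[-1]
    norm = (2.0 * np.pi * h**2) ** (-d / 2.0)
    return norm * np.exp(-0.5 * np.sum(u**2, axis=-1) / h**2)

def parzen_density(X, samples, h):
    """Parzen estimate p_hat(X) = (1/N) * sum_j kappa(X - X_j)."""
    return np.mean(gaussian_kernel(X - samples, h))

def classify(X, S1, S2, h1, h2, t=0.0):
    """Eq. (7.1): assign X to omega_1 if -ln p1_hat/p2_hat < t, else omega_2."""
    stat = -np.log(parzen_density(X, S1, h1) / parzen_density(X, S2, h2))
    return 1 if stat < t else 2
```

For a Bayes classifier with prior probabilities $P_1$ and $P_2$, the threshold in this convention would be $t = \ln(P_1/P_2)$.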

Error estimation: In order to estimate the error of this classifier from the given data set $S$, we may use the resubstitution (R) and leave-one-out (L) methods to obtain lower and upper bounds on the Bayes error. In the R method, all available samples are used to design the classifier, and the same sample set is tested. Therefore, when a sample $X_k^{(1)}$ from $\omega_1$ is tested, the following equation is used.
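
In code, the distinction between the two methods reduces to whether the tested sample is allowed to remain in its own design set. Below is a rough sketch (an illustration under assumed names, not the book's procedure), reusing the hypothetical parzen_density helper from the sketch above; the symmetric loop over the $\omega_2$ samples is omitted.

```python
import numpy as np  # uses parzen_density from the sketch above

def error_estimates_class1(S1, S2, h1, h2, t=0.0):
    """Resubstitution (R) and leave-one-out (L) error rates for omega_1 samples.

    R: the tested sample X_k^(1) stays in S1 when p1_hat is estimated.
    L: X_k^(1) is removed from S1 before estimating p1_hat.
    """
    n_err_R = n_err_L = 0
    for k in range(len(S1)):
        X = S1[k]
        p2 = parzen_density(X, S2, h2)
        # R method: the design set includes the test sample itself.
        p1_R = parzen_density(X, S1, h1)
        n_err_R += (-np.log(p1_R / p2) >= t)   # misclassified as omega_2
        # L method: exclude the test sample from its own design set.
        S1_minus_k = np.delete(S1, k, axis=0)
        p1_L = parzen_density(X, S1_minus_k, h1)
        n_err_L += (-np.log(p1_L / p2) >= t)
    return n_err_R / len(S1), n_err_L / len(S1)
```

Because each tested sample also helped design the classifier, the R count tends to be optimistic, while removing it in the L method biases the count the other way; this is the sense in which the two estimates bracket the performance, as stated above.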