Page 159 - Introduction to Statistical Pattern Recognition
4 Parametric Classifiers 141
Since Procedures II and III produce different s's, V's, $v_0$'s, and $\varepsilon$'s, we
need to know which s, V, $v_0$, and $\varepsilon$ to use. Once a classifier has been designed
by using N samples and implemented, the classifier is supposed to classify
samples which were never used in design. Therefore, the error of Procedure III
is the one that indicates the performance of the classifier in operation. However,
the error of Procedure III alone does not tell us how much the error could be
reduced by using a larger number of design samples. The error of the ideal
classifier, which is designed with an infinite number of design samples, lies
somewhere between the errors of Procedures II and III. Therefore, in order to
predict the asymptotic error experimentally, it is common practice to run both
Procedures II and III. As far as the parameter selection of the classifier is
concerned, we can get better estimates of these parameters by using a larger
number of design samples. Therefore, if the available sample size is fixed, it
is best to use all samples to design the classifier. Thus, the s, V, and $v_0$
obtained by Procedure II are the ones which must be used in classifier design.
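
The gap between the two error estimates can be illustrated numerically. The following is a minimal sketch, not the book's procedure: it assumes that one estimate is obtained by testing on the design samples themselves and the other by testing on independent samples, that the classes are equally likely, and that the classifier has the form h(X) = V^T X + v_0 with h(X) < 0 assigned to class 1. The plug-in design rule (averaged sample covariance), the synthetic Gaussian data, and all function names are assumptions made for illustration only.

```python
import numpy as np

def design_linear(X1, X2):
    """Plug-in linear classifier h(X) = V^T X + v0 (illustrative design rule).

    Assumed convention: h(X) < 0 -> class 1, h(X) >= 0 -> class 2.
    """
    M1, M2 = X1.mean(axis=0), X2.mean(axis=0)
    # Averaged sample covariance; a stand-in for whatever criterion fixes V.
    S = 0.5 * (np.cov(X1, rowvar=False) + np.cov(X2, rowvar=False))
    V = np.linalg.solve(S, M2 - M1)
    v0 = -0.5 * V @ (M1 + M2)   # threshold halfway between the projected means
    return V, v0

def error_rate(V, v0, X1, X2):
    """Average of the two per-class error fractions (equal priors assumed)."""
    e1 = np.mean(X1 @ V + v0 >= 0)  # class-1 samples assigned to class 2
    e2 = np.mean(X2 @ V + v0 < 0)   # class-2 samples assigned to class 1
    return 0.5 * (e1 + e2)

def sample(rng, n):
    """Draw n samples per class from two unit-covariance Gaussians (synthetic)."""
    X1 = rng.normal(size=(n, 2))
    X2 = rng.normal(size=(n, 2)) + np.array([2.0, 0.0])
    return X1, X2

rng = np.random.default_rng(0)
D1, D2 = sample(rng, 50)      # design samples
T1, T2 = sample(rng, 5000)    # independent test samples, never used in design

V, v0 = design_linear(D1, D2)
err_design = error_rate(V, v0, D1, D2)  # tested on the design samples
err_test = error_rate(V, v0, T1, T2)    # tested on independent samples

print(f"error on design samples:      {err_design:.3f}")
print(f"error on independent samples: {err_test:.3f}")
```

Repeating the experiment over many random design sets and averaging gives a more stable picture than a single run, since either estimate can fluctuate considerably when the design set is small.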
Before leaving this subject, the reader should be reminded that the cri-
teria discussed in this section can be used to evaluate the performance of a
linear classifier regardless of whether the classifier is optimum or not. For a
given linear classifier and given test distributions, $\eta_i$ and $\sigma_i^2$ are computed
from (4.19) and (4.20), and they are inserted into a chosen criterion to evaluate
its performance. When the distributions of X are normal for both $\omega_1$ and $\omega_2$,
$h(X)$, being a linear function of X, is also normal. Thus, we can use the error of (4.38).
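
As a rough numerical illustration of this evaluation, the sketch below computes the mean and variance of a linear discriminant under each class and then the error for normal distributions. Equations (4.19), (4.20), and (4.38) are not reproduced in this section, so the forms used here are assumptions: h(X) = V^T X + v_0, mean V^T M_i + v_0, variance V^T Σ_i V, and the decision h(X) < 0 for class 1. The function names and the example parameters are hypothetical.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def linear_moments(V, v0, M, S):
    """Mean and variance of h(X) = V^T X + v0 when X has mean M, covariance S."""
    eta = float(V @ M + v0)
    var = float(V @ S @ V)
    return eta, var

def normal_error(V, v0, M1, S1, M2, S2, P1=0.5):
    """Error of the linear classifier when h(X) is normal under both classes.

    Assumed convention: h(X) < 0 -> class 1, h(X) > 0 -> class 2.
    """
    eta1, var1 = linear_moments(V, v0, M1, S1)
    eta2, var2 = linear_moments(V, v0, M2, S2)
    e1 = normal_cdf(eta1 / math.sqrt(var1))    # P(h > 0 | class 1)
    e2 = normal_cdf(-eta2 / math.sqrt(var2))   # P(h < 0 | class 2)
    return P1 * e1 + (1.0 - P1) * e2

# Hypothetical test distributions: unit covariance, means two units apart.
M1 = np.array([0.0, 0.0])
M2 = np.array([2.0, 0.0])
I = np.eye(2)

# An arbitrary linear classifier to evaluate; it need not be optimum.
V = M2 - M1
v0 = -0.5 * V @ (M1 + M2)

eps = normal_error(V, v0, M1, I, M2, I)
print(f"error = {eps:.4f}")
```

The same two moments could instead be inserted into any other criterion of interest; only the last step, the conversion of moments into an error, relies on normality.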
Optimum Design of a Nonlinear Classifier
So far, we have limited our discussion to a linear classifier. However,
we can extend the previous discussion to a more general nonlinear classifier.
General nonlinear classifier: Let $h(X)$ be a general discriminant function,
with X classified according to

$$h(X) \underset{\omega_2}{\overset{\omega_1}{\lessgtr}} 0 \tag{4.47}$$

Also, let $f(\eta_1, \eta_2, \sigma_1^2, \sigma_2^2)$ be the criterion to be optimized with respect to $h(X)$,
where