where
(4.56)
Note that $\partial f/\partial\sigma_i^2 = (\partial f/\partial s_i^2)(\partial s_i^2/\partial\sigma_i^2) = \partial f/\partial s_i^2$, since $s_i^2 = \sigma_i^2 + \eta_i^2$. The optimum solution, (4.54), may be interpreted as $h(X) = a_1 q_1(X) + a_2 q_2(X) = a_2 + (a_1 - a_2)q_1(X)$, where $q_1(X)$ is the a posteriori probability function of $\omega_1$ with a priori probability of $s$. On the other hand, the Bayes classifier is $q_1(X) \gtrless q_2(X)$, and subsequently the Bayes discriminant function is $h(X) = q_2(X) - q_1(X) = 1 - 2q_1(X) \gtrless 0$. Therefore, if we seek the discriminant function by optimizing a criterion $f(\eta_1, \eta_2, \sigma_1^2, \sigma_2^2)$, we obtain the Bayes discriminant function as the solution, except that different constants are multiplied by and added to $q_1(X)$. The difference in the added constants can be eliminated by adjusting the threshold, and the difference in the multiplied constants does not affect the decision rule, as was discussed previously.
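To make the scale-and-shift argument concrete, suppose for instance that $a_1 > a_2$ (if $a_1 < a_2$, the inequalities below are simply reversed); then thresholding the solution $h(X) = a_2 + (a_1 - a_2)q_1(X)$ at the midpoint $(a_1 + a_2)/2$ is equivalent to the Bayes comparison of the a posteriori probabilities:

$$
h(X) \;\gtrless\; \frac{a_1 + a_2}{2}
\;\;\Longleftrightarrow\;\;
(a_1 - a_2)\,q_1(X) \;\gtrless\; \frac{a_1 - a_2}{2}
\;\;\Longleftrightarrow\;\;
q_1(X) \;\gtrless\; \frac{1}{2}
\;\;\Longleftrightarrow\;\;
q_1(X) \;\gtrless\; q_2(X),
$$

since $q_1(X) + q_2(X) = 1$. Thus the added constant $a_2$ only shifts the threshold, and the positive factor $(a_1 - a_2)$ only rescales the discriminant function, so the resulting decision rule coincides with that of the Bayes discriminant function $1 - 2q_1(X)$.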
The above result further justifies the use of the criterion $f(\eta_1, \eta_2, \sigma_1^2, \sigma_2^2)$.
The criterion not only provides a simple solution for linear classifier design,
but also guarantees the best solution in the Bayes sense for general nonlinear
classifier design. This guarantee enhances the validity of the criterion,
although the above analysis does not directly reveal the procedure for obtaining
the optimum nonlinear solution.
Linear classifier: When we limit the mathematical form of $h(X)$ to $h(X) = V^T X$, the variation of $h(X)$ comes from the variation of $V$. Therefore,