Page 161 - Introduction to Statistical Pattern Recognition

4 Parametric Classifiers                                      143


















                    where








                                                                                (4.56)


                    Note that $\partial f/\partial\sigma_i^2 = (\partial f/\partial s_i^2)(\partial s_i^2/\partial\sigma_i^2) = \partial f/\partial s_i^2$, since $s_i^2 = \sigma_i^2 + \eta_i^2$. The
                    optimum solution, (4.54), may be interpreted as $y(X) = a_1 q_1(X) + a_2 q_2(X) =
                    a_2 + (a_1 - a_2) q_1(X)$, where $q_1(X)$ is the a posteriori probability function of $\omega_1$ with
                    a priori probability of $s$. On the other hand, the Bayes classifier is
                    $q_1(X) \underset{\omega_2}{\overset{\omega_1}{\gtrless}} q_2(X)$, and subsequently the Bayes discriminant function is
                    $h(X) = q_2(X) - q_1(X) = 1 - 2q_1(X) \underset{\omega_2}{\overset{\omega_1}{\lessgtr}} 0$. Therefore, if we seek the discriminant
                    function by optimizing a criterion $f(\eta_1,\eta_2,s_1^2,s_2^2)$, we obtain the Bayes
                    discriminant function as the solution, except that different constants are
                    multiplied and added to $q_1(X)$. The difference in the added constants can be
                    eliminated by adjusting the threshold, and the difference in the multiplied constants
                    does not affect the decision rule, as was discussed previously.
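
As a small numerical illustration of the argument above, the following Python sketch compares decisions made from h(X) = 1 - 2 q_1(X) with decisions made from y(X) = a_2 + (a_1 - a_2) q_1(X) once the threshold is moved to (a_1 + a_2)/2, the value y takes when q_1 = 1/2. The one-dimensional Gaussian densities, the value s = 0.4, the constants a_1 and a_2, and all function names are arbitrary assumptions chosen only for this example; none of them come from the text.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def q1(x, s=0.4):
    """A posteriori probability of class 1 with a priori probability s.

    The two class densities are assumed for illustration only.
    """
    p1 = gaussian_pdf(x, -1.0, 1.0)
    p2 = gaussian_pdf(x, 1.5, 2.0)
    return s * p1 / (s * p1 + (1.0 - s) * p2)


def decide_by_h(x):
    """Bayes rule: class 1 when h(X) = 1 - 2 q1(X) is negative."""
    return 1 if 1.0 - 2.0 * q1(x) < 0.0 else 2


def decide_by_y(x, a1=3.0, a2=-0.5):
    """Rule based on y(X) = a2 + (a1 - a2) q1(X), assuming a1 != a2.

    The threshold (a1 + a2) / 2 absorbs the added constant. A negative
    multiplier (a1 < a2) reverses the direction of the inequality.
    """
    y = a2 + (a1 - a2) * q1(x)
    threshold = 0.5 * (a1 + a2)
    if a1 > a2:
        return 1 if y > threshold else 2
    return 1 if y < threshold else 2


# Compare the two rules on a grid of sample points.
xs = [-6.0 + 0.05 * k for k in range(241)]
agree = all(decide_by_h(x) == decide_by_y(x) for x in xs)
print("decisions agree on the grid:", agree)
```

When a_1 < a_2 the multiplied constant is negative, so the sketch reverses the inequality; the decision boundary itself stays where q_1(X) = 1/2.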
                         The above result further justifies the use of the criterion $f(\eta_1,\eta_2,\sigma_1^2,\sigma_2^2)$.
                    The criterion not  only provides a  simple solution for linear classifier design,
                    but  also guarantees the best solution in  the Bayes sense for general nonlinear
                    classifier  design.  This  guarantee  enhances  the  validity  of  the  criterion,
                    although the above analysis does not directly reveal the procedure for obtaining
                    the optimum nonlinear solution.
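
The claim can be checked numerically on a toy problem. The Python sketch below takes one hypothetical choice of the criterion, f = (eta_1 - eta_2)^2 / (sigma_1^2 + sigma_2^2), on a five-point discrete sample space and tests whether a function of the form a_2 + (a_1 - a_2) q_1 is a stationary point of f. Because the two variances enter this f symmetrically, the sketch assumes the prior-like weight is 1/2, so that q_1 = p_1 / (p_1 + p_2). The criterion, the discrete densities, the constants, and the function names are all assumptions made for illustration; the check is a finite-difference test, not a procedure given in the text.

```python
def moments(y, p):
    """Mean and variance of y under the discrete density p."""
    eta = sum(yk * pk for yk, pk in zip(y, p))
    second = sum(yk * yk * pk for yk, pk in zip(y, p))
    return eta, second - eta * eta


def fisher(y, p1, p2):
    """Assumed criterion: (eta1 - eta2)^2 / (sigma1^2 + sigma2^2)."""
    eta1, var1 = moments(y, p1)
    eta2, var2 = moments(y, p2)
    return (eta1 - eta2) ** 2 / (var1 + var2)


# Assumed class-conditional densities on five sample points.
p1 = [0.40, 0.30, 0.15, 0.10, 0.05]
p2 = [0.05, 0.10, 0.20, 0.25, 0.40]

# Posterior-like function with weight 1/2, and an affine function of it.
q1 = [a / (a + b) for a, b in zip(p1, p2)]
a1, a2 = 2.0, -1.0  # arbitrary constants
y_opt = [a2 + (a1 - a2) * q for q in q1]


def numerical_gradient(y, eps=1e-6):
    """Central-difference gradient of the criterion with respect to y."""
    grad = []
    for k in range(len(y)):
        up = list(y)
        down = list(y)
        up[k] += eps
        down[k] -= eps
        grad.append((fisher(up, p1, p2) - fisher(down, p1, p2)) / (2.0 * eps))
    return grad


grad_at_opt = numerical_gradient(y_opt)
f_opt = fisher(y_opt, p1, p2)
# A function that is linear in the point index, for comparison.
f_linear = fisher([0.0, 1.0, 2.0, 3.0, 4.0], p1, p2)

print("largest gradient magnitude:", max(abs(g) for g in grad_at_opt))
print("criterion at y_opt:", f_opt, " at the linear y:", f_linear)
```

Under these assumptions the gradient vanishes to within finite-difference error, and the posterior-based function scores higher than the comparison function, which is consistent with the statement that the optimum is an affine function of the a posteriori probability.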

                         Linear classifier: When we limit the mathematical form of $y(X)$ to
                    $y = V^T X$, the variation of $y(X)$ comes from the variation of $V$. Therefore,