Page 149 - Introduction to Statistical Pattern Recognition

4  Parametric Classifiers                                     131



                     \[
                       \Pr\{\mathbf{X} = X \mid \omega_i\} = \prod_{j=1}^{n} \Pr\{\mathbf{x}_j = x_j \mid \omega_i\}
                       \tag{4.16}
                     \]
                     Thus, the minus-log likelihood ratio of (4.16) becomes

                     \[
                       h(X) = -\ln \frac{\Pr\{\mathbf{X} = X \mid \omega_1\}}{\Pr\{\mathbf{X} = X \mid \omega_2\}}
                     \]













                     This is a linear function of x_j.
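
                     The linearity can be checked numerically. The sketch below is only an
                     illustration under assumptions not stated on this page: each x_j is taken
                     to be +1 or -1, and p1[j], p2[j] stand for Pr{x_j = +1 | class 1} and
                     Pr{x_j = +1 | class 2}. The function names are hypothetical. It computes
                     h(X) once from the product form of (4.16) and once as a weighted sum of
                     the x_j plus a constant, and compares the two.

```python
import math
from itertools import product


def h_direct(x, p1, p2):
    """Minus-log likelihood ratio from the product form (independent inputs).

    Assumption: x[j] is +1 or -1; p1[j], p2[j] are Pr{x_j = +1 | class}.
    """
    total = 0.0
    for xj, a, b in zip(x, p1, p2):
        pr1 = a if xj == 1 else 1.0 - a  # Pr{x_j | class 1}
        pr2 = b if xj == 1 else 1.0 - b  # Pr{x_j | class 2}
        total -= math.log(pr1 / pr2)
    return total


def linear_coefficients(p1, p2):
    """Weights w[j] and offset w0 such that h(X) = sum_j w[j] * x[j] + w0."""
    w = [
        0.5 * (math.log(b / (1.0 - b)) - math.log(a / (1.0 - a)))
        for a, b in zip(p1, p2)
    ]
    w0 = 0.5 * sum(
        math.log(b * (1.0 - b)) - math.log(a * (1.0 - a))
        for a, b in zip(p1, p2)
    )
    return w, w0


def h_linear(x, w, w0):
    """Evaluate the linear form of the minus-log likelihood ratio."""
    return sum(wj * xj for wj, xj in zip(w, x)) + w0


if __name__ == "__main__":
    # Illustrative probabilities, chosen arbitrarily for the check.
    p1 = [0.8, 0.3, 0.6]
    p2 = [0.4, 0.7, 0.5]
    w, w0 = linear_coefficients(p1, p2)
    worst = max(
        abs(h_direct(x, p1, p2) - h_linear(x, w, w0))
        for x in product([-1, 1], repeat=len(p1))
    )
    print("largest gap between product form and linear form:", worst)
```

                     The weights follow from writing ln Pr{x_j | class} as
                     (x_j / 2) ln[P / (1 - P)] + (1 / 2) ln[P (1 - P)], which holds for
                     x_j = +1 and x_j = -1 under the encoding assumed here.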




                     4.2  Linear Classifier Design

                          Linear classifiers are the simplest ones as far as implementation is
                     concerned, and are directly related to many known techniques such as
                     correlations and Euclidean distances. However, in the Bayes sense, linear
                     classifiers are optimum only for normal distributions with equal covariance
                     matrices. In some applications such as signal detection in communication
                     systems, the assumption of equal covariance is reasonable because the
                     properties of the noise do not change very much from one signal to another.
                     However, in many other applications of pattern recognition, the assumption
                     of equal covariance is not appropriate.
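
                     Why equal covariance matrices lead to a linear classifier can be seen in a
                     small numerical sketch. For two normal densities sharing one covariance
                     matrix, the minus-log likelihood ratio is half the difference of two
                     quadratic distances, and the quadratic terms in X cancel. The code below
                     is a two-dimensional illustration only; the helper names, mean vectors,
                     and covariance values are hypothetical choices, not taken from the text.

```python
def inv2(s):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = s
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def bilinear(u, s_inv, v):
    """Compute u^T S^{-1} v for 2-vectors u and v."""
    return sum(u[i] * s_inv[i][j] * v[j] for i in range(2) for j in range(2))


def h_quadratic(x, m1, m2, s_inv):
    """Minus-log likelihood ratio of two normals with a shared covariance.

    The log-determinant terms are equal for both classes and cancel.
    """
    d1 = [x[i] - m1[i] for i in range(2)]
    d2 = [x[i] - m2[i] for i in range(2)]
    return 0.5 * bilinear(d1, s_inv, d1) - 0.5 * bilinear(d2, s_inv, d2)


def h_linear_gauss(x, m1, m2, s_inv):
    """The same quantity after expanding: a linear function of x."""
    diff = [m2[i] - m1[i] for i in range(2)]
    offset = 0.5 * (bilinear(m1, s_inv, m1) - bilinear(m2, s_inv, m2))
    return bilinear(diff, s_inv, x) + offset


if __name__ == "__main__":
    # Hypothetical parameters for the illustration.
    m1, m2 = [0.0, 0.0], [2.0, 1.0]
    s_inv = inv2([[2.0, 0.5], [0.5, 1.0]])
    for x in ([0.0, 0.0], [1.0, 0.5], [3.0, -2.0]):
        print(x, h_quadratic(x, m1, m2, s_inv), h_linear_gauss(x, m1, m2, s_inv))
```

                     A negative value means the class 1 density is the larger one at that
                     point. With unequal covariance matrices the quadratic terms no longer
                     cancel, which is why a linear classifier is then no longer Bayes optimal.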
                          Many attempts have been made to design the best linear classifiers for
                     normal distributions with unequal covariance matrices and for non-normal
                     distributions. Of course, these are not optimum, but in many cases the
                     simplicity and robustness of the linear classifier more than compensate for
                     the loss in performance. In this section, we will discuss how linear
                     classifiers are designed for these more complicated cases.
                          Since it is predetermined that we  use a linear classifier regardless of  the
                     given distributions, our decision rule should be