Page 165 - Introduction to Statistical Pattern Recognition

4  Parametric Classifiers                                    147



                    Other Desired Outputs and Search Techniques

                        In pattern recognition, the classifier should be designed by using samples
                    near the decision boundary; samples far from the decision boundary are less
                    important to the design.  However, if we fix the desired output y(X) and try to
                    minimize the mean-square error between h(X) and y(X), samples with larger
                    h(X), that is, samples far from the decision boundary, contribute more to
                    the mean-square error.  This has long been recognized as a disadvantage of
                    the mean-square error approach in pattern recognition.  In this section,
                    we discuss a modification which reduces this effect.
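
                    The effect can be illustrated with a small numerical sketch.  The values
                    below are hypothetical, the desired output is fixed at 1 purely for
                    illustration, and h is assumed to be signed so that a positive value means
                    a correct classification; none of these choices come from the text.

```python
import numpy as np

def squared_error_contributions(h_values, desired_output=1.0):
    """Per-sample squared error (h - y)^2 against a fixed desired output y."""
    h = np.asarray(h_values, dtype=float)
    return (h - desired_output) ** 2

# Hypothetical discriminant values, all correctly classified (h > 0 by assumption).
h_near = [0.5, 1.5]    # close to the decision boundary h = 0
h_far = [6.0, 11.0]    # far from the boundary

err_near = squared_error_contributions(h_near)   # [0.25, 0.25]
err_far = squared_error_contributions(h_far)     # [25.0, 100.0]

print("near-boundary contributions:", err_near)
print("far-from-boundary contributions:", err_far)
```

                    Although the far samples are classified with the largest margin, they
                    dominate the total error, so a minimization driven by this criterion is
                    governed mostly by the samples that matter least to the boundary.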


                        New notation for the discriminant function: Before proceeding, let us
                    introduce new notation which will simplify the discussion later.  Instead of
                    (4.18), we will write the linear discriminant function as

                                   $h(X) = -V^T X - v_0 > 0$   for  $X \in \omega_1$ ,   (4.69)

                                   $h(X) = V^T X + v_0 > 0$    for  $X \in \omega_2$ .   (4.70)

                    Furthermore, if  we introduce a new vector to express a sample as

                                   $Z = [-1 \;\; -x_1 \;\ldots\; -x_n]^T$   for  $X \in \omega_1$ ,   (4.71)

                                   $Z = [+1 \;\; x_1 \;\ldots\; x_n]^T$     for  $X \in \omega_2$ ,   (4.72)

                    then, the discriminant  function becomes simply

                                   $h(Z) = W^T Z = \sum_{i=0}^{n} w_i z_i > 0$ ,               (4.73)


                    where $z_0$ is either +1 or -1, and $w_i = v_i$  (i = 0, 1, . . . , n).
                        Thus, our design procedure  is
                           (1)  to generate a new set of  vectors Z’s from X’s, and
                           (2)  to find $W^T$ so as to satisfy (4.73) for as many Z’s as possible.
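
                    The two steps above can be sketched as follows.  This is a minimal
                    illustration under stated assumptions: the helper names make_z and
                    count_satisfied, the class labels 1 and 2, the sample values, and the
                    candidate weight vector are all hypothetical, and the code only evaluates
                    a given W rather than searching for one.

```python
import numpy as np

def make_z(X, labels):
    """Build Z as in (4.71)-(4.72): prepend 1 to each X, then negate class-1 rows."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    Z = np.hstack([np.ones((X.shape[0], 1)), X])   # rows [+1, x_1, ..., x_n]
    Z[labels == 1] *= -1.0                         # rows [-1, -x_1, ..., -x_n] for class 1
    return Z

def count_satisfied(W, Z):
    """Number of samples with W^T Z > 0, i.e. samples satisfying (4.73)."""
    return int(np.sum(Z @ np.asarray(W, dtype=float) > 0))

# Hypothetical two-dimensional samples: two from class 1, two from class 2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 3.0], [4.0, 3.0]])
labels = np.array([1, 1, 2, 2])

Z = make_z(X, labels)
W = np.array([-4.0, 1.0, 1.0])   # candidate [w_0, w_1, w_2] = [v_0, v_1, v_2]

print(count_satisfied(W, Z))     # 4: every sample satisfies W^T Z > 0
```

                    Because the sign of each class-1 sample is folded into Z, a single
                    inequality covers both classes, and the quality of a candidate W reduces
                    to counting positive entries of the products W^T Z.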

                        Desired outputs: Using  the  notation  of  (4.73), new  desired  outputs will
                    be  introduced.  Also, the expectation  in  (4.64) is replaced  by  the  sample mean
                    to obtain the following mean-square  errors: