


expected to be much smaller than the L-curve of Fig. 7-14(c) (5.5% at k = 5), but larger than the R-curve (1.3% at k = 5). Note again that the curves of Fig. 7-14(c) were obtained as the 10-trial average by using the threshold of Option 4, while the dotted line of Fig. 7-15 was determined by human judgement for one trial.
In order to apply the Neyman-Pearson test with ε2 = 2%, for example, we maintain the 45° slope of the line and shift it until we find 2 misclassified ω2-samples out of 100.
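A minimal Python sketch of this thresholding step, under the assumption that each sample is plotted at (n ln d1(X), n ln d2(X)) and that the upper side of a 45° line is the ω1-region; the arrays u2 and v2 below are synthetic stand-ins, not the data of Fig. 7-15:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical display coordinates for 100 omega_2 test samples:
# u2 = n*ln d1(X), v2 = n*ln d2(X); real values would come from the
# distances to the k-th nearest neighbors of each class.
u2 = rng.normal(0.0, 1.0, size=100)
v2 = u2 + rng.normal(1.0, 0.7, size=100)

# A 45-degree decision line on the display is  n*ln d2 = n*ln d1 + t.
# With the omega_1 region taken as the upper side, an omega_2 sample
# above the line is misclassified.
def omega2_errors(t):
    return int(np.sum(v2 - u2 > t))

# Neyman-Pearson with epsilon_2 = 2%: keep the 45-degree slope and slide
# the intercept t until exactly 2 of the 100 omega_2 samples lie above it.
offsets = np.sort(v2 - u2)
t_np = offsets[-3]              # intercept at the 3rd largest offset
print(omega2_errors(t_np))      # -> 2 (ties have probability zero here)
```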

Constant risk contours: It is frequently desirable to have risk information about a point in a data display. For the two-class case, the risk at X, r(X), is given by

    r(X) = \frac{P_1 p_1(X)}{P_1 p_1(X) + P_2 p_2(X)}  \qquad \text{for } P_1 p_1(X) \le P_2 p_2(X)                    (7.79)
Substituting p_j(X) = (k_j - 1)/[N_j c_0 |\Sigma_j|^{1/2} d_j^n(X)], and taking logarithms,

    n \ln d_2(X) = n \ln d_1(X) - \ln \frac{(k_1 - 1) N_2 |\Sigma_2|^{1/2} P_1}{(k_2 - 1) N_1 |\Sigma_1|^{1/2} P_2} + \ln \frac{r(X)}{1 - r(X)}                    (7.80)
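The shift prescribed by (7.80) can be evaluated directly. The sketch below is illustrative only; the helper risk_contour_intercept and the design values (k_i, N_i, |\Sigma_i|, P_i) are hypothetical choices, not parameters taken from the text:

```python
import numpy as np

def risk_contour_intercept(r, k1, k2, N1, N2, det_S1, det_S2, P1, P2):
    """Intercept t of the 45-degree constant-risk line of (7.80),
    n*ln d2 = n*ln d1 + t, on the side where P1*p1(X) <= P2*p2(X).
    det_S1 and det_S2 stand for |Sigma_1| and |Sigma_2|."""
    const = np.log((k1 - 1) * N2 * np.sqrt(det_S2) * P1 /
                   ((k2 - 1) * N1 * np.sqrt(det_S1) * P2))
    return -const + np.log(r / (1.0 - r))

# Hypothetical design: k1 = k2 = 5, N1 = N2 = 100, |Sigma_1| = |Sigma_2| = 1,
# P1 = P2 = 0.5, so the r = 0.5 contour coincides with the Bayes decision line.
print(risk_contour_intercept(0.5, 5, 5, 100, 100, 1.0, 1.0, 0.5, 0.5))  # 0.0
print(risk_contour_intercept(0.1, 5, 5, 100, 100, 1.0, 1.0, 0.5, 0.5))  # ln(1/9)
```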
Thus, for a given r(X), a contour can be drawn on the display. The resulting contour is a 45° line shifted by ln{r(X)/[1 - r(X)]} [16]. Similarly, for P_1 p_1(X) > P_2 p_2(X), the numerator of (7.79) is replaced by P_2 p_2(X), and (7.80) is modified to
    n \ln d_2(X) = n \ln d_1(X) - \ln \frac{(k_1 - 1) N_2 |\Sigma_2|^{1/2} P_1}{(k_2 - 1) N_1 |\Sigma_1|^{1/2} P_2} + \ln \frac{1 - r(X)}{r(X)}                    (7.81)
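The symmetry noted below can be checked numerically. The following sketch reuses the same hypothetical design values; only the intercepts implied by (7.80) and (7.81) are evaluated:

```python
import numpy as np

# Constant term shared by (7.80) and (7.81) for the hypothetical design
# k1 = k2 = 5, N1 = N2 = 100, |Sigma_1| = |Sigma_2| = 1, P1 = P2 = 0.5.
const = np.log((5 - 1) * 100 * np.sqrt(1.0) * 0.5 /
               ((5 - 1) * 100 * np.sqrt(1.0) * 0.5))
t_bayes = -const                            # intercept of the Bayes decision line

for r in (0.4, 0.25, 0.1):
    t_780 = -const + np.log(r / (1 - r))    # contour where P1*p1(X) <= P2*p2(X)
    t_781 = -const + np.log((1 - r) / r)    # contour where P1*p1(X) >  P2*p2(X)
    # Equal offsets of opposite sign: the two lines for the same r are mirror
    # images about the Bayes line, and they move away from it as r decreases.
    print(r, t_780 - t_bayes, t_781 - t_bayes)
```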

Comparison of (7.80) and (7.81) indicates that the constant risk lines are symmetric about the Bayes decision line, where r(X) = 0.5. As r(X) is decreased, the constant risk lines move farther from the Bayes decision line. This result indicates that points mapped near the Bayes decision line on the display do in fact have a high degree of risk associated with them, while points mapped far from the Bayes decision line have a low degree of risk. This desirable pro-