Page 263 - Introduction to Statistical Pattern Recognition

5  Parameter Estimation                                       245









                                                                              (5.180)

                   where $\hat d_i^2(X) = (X-\hat M_i)^T \hat\Sigma_i^{-1} (X-\hat M_i)$.  Thus, the expectation of
                   the bootstrap bias for a quadratic classifier given a sample set
                   $S_1 = \{X_1^{(1)}, \ldots, X_{N_1}^{(1)}, X_1^{(2)}, \ldots, X_{N_2}^{(2)}\}$
                   becomes

                                                                              (5.181)

                   where


                                                                              (5.182)
                   Note that (5.151) and (5.182) are very similar.  The differences are
                   $\hat d_i^2$ vs. $d_i^2$ and $\hat h$ vs. $h_L$.  The discriminant function $\hat h$ of
                   (5.182) is designed with $\hat M_i$ and $\hat\Sigma_i$, the sample mean and sample
                   covariance of the sample set $S_1$.  The test samples $X_k^{(i)}$ are the members
                   of the same set, $S_1$.  Therefore, $\hat h$ is the same as the R discriminant
                   function $h_R$, while $h_L$ of (5.151) is the L discriminant function.  For a
                   given $S_1$, $h_R$ is a fixed function.  However, if a random set, $S$, replaces
                   the fixed set, $S_1$, the discriminant function becomes a random variable,
                   $h_R$.  As shown in (5.148) and (5.149), the difference between $h_L$ and $h_R$
                   is proportional to $1/N$.  Thus, the difference between $e^{j\omega h_L}$ and
                   $e^{j\omega h_R}$ is proportional to $1/N$.  Also, it can be shown that the
                   difference between $\hat d_i^2$ and $d_i^2$ is proportional to $1/N$.  Thus,
                   ignoring terms with $1/N$, $\varepsilon_b$ of (5.150) and
                   $E_*\{\hat\varepsilon_b^* \mid S\}$ of (5.181) (note that $S$ is now a random set)
                   become equal and have the same statistical properties.  Practically, this
                   means that estimating the expected error rate using the L and bootstrap
                   methods should yield the same results.
                        These conclusions have been confirmed  experimentally.
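
The comparison between the L and bootstrap estimates can be tried numerically. The following is a minimal sketch, not the book's procedure: it assumes equal priors, the standard quadratic discriminant built from sample means and sample covariances, and a common bootstrap bias correction (design on a resampled set, test on the original set, subtract the resubstitution error on the resampled set, then add the averaged bias to the R error). All function names (`fit`, `h`, `error`, `r_error`, `l_error`, `bootstrap_error`) are hypothetical helpers introduced for illustration.

```python
import numpy as np

def fit(X1, X2):
    """Sample mean and sample covariance (N-1 normalization) for each class."""
    return [(X.mean(axis=0), np.cov(X, rowvar=False)) for X in (X1, X2)]

def h(X, params):
    """Quadratic discriminant; h < 0 assigns class 1 (equal priors assumed)."""
    terms = []
    for M, S in params:
        D = X - M
        # Squared distance (X - M)^T S^{-1} (X - M) for every row of X.
        d2 = np.einsum("ij,ij->i", D @ np.linalg.inv(S), D)
        terms.append(0.5 * d2 + 0.5 * np.linalg.slogdet(S)[1])
    return terms[0] - terms[1]

def error(T1, T2, params):
    """Error rate on class-1 samples T1 and class-2 samples T2, equal priors."""
    return 0.5 * np.mean(h(T1, params) > 0) + 0.5 * np.mean(h(T2, params) <= 0)

def r_error(X1, X2):
    """Resubstitution (R) error: design and test on the same set."""
    return error(X1, X2, fit(X1, X2))

def l_error(X1, X2):
    """Leave-one-out (L) error: each sample is tested on a design that excludes it."""
    wrong1 = sum(h(X1[k:k + 1], fit(np.delete(X1, k, axis=0), X2))[0] > 0
                 for k in range(len(X1)))
    wrong2 = sum(h(X2[k:k + 1], fit(X1, np.delete(X2, k, axis=0)))[0] <= 0
                 for k in range(len(X2)))
    return 0.5 * wrong1 / len(X1) + 0.5 * wrong2 / len(X2)

def bootstrap_error(X1, X2, n_boot, rng):
    """R error plus the averaged bootstrap bias (assumed bias-correction scheme)."""
    biases = []
    for _ in range(n_boot):
        B1 = X1[rng.integers(0, len(X1), len(X1))]  # resample class 1 with replacement
        B2 = X2[rng.integers(0, len(X2), len(X2))]  # resample class 2 with replacement
        params = fit(B1, B2)
        # Bias: error on the original set minus resubstitution error on the resample.
        biases.append(error(X1, X2, params) - error(B1, B2, params))
    return r_error(X1, X2) + float(np.mean(biases))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X1 = rng.normal(0.0, 1.0, size=(40, 2))   # synthetic class-1 data (assumption)
    X2 = rng.normal(1.5, 2.0, size=(40, 2))   # synthetic class-2 data (assumption)
    print("R:", r_error(X1, X2))
    print("L:", l_error(X1, X2))
    print("bootstrap:", bootstrap_error(X1, X2, 200, rng))
```

On a single small sample set the two estimates need not coincide; the claim in the text concerns their expectations, so averaging over many independently drawn sample sets is the more meaningful check.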