Page 249 - Introduction to Statistical Pattern Recognition

5  Parameter Estimation                                      231



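Example 1: Let f be

$$ f(X, M, \Sigma) = \frac{1}{2}(X-M)^T \Sigma^{-1} (X-M) + \frac{1}{2} \ln|\Sigma| . \tag{5.143} $$

Then,

$$ \frac{\partial f}{\partial M} = -\Sigma^{-1}(X-M) , \tag{5.144} $$

$$ \frac{\partial f}{\partial \Sigma} = \frac{1}{2}\left[\Sigma^{-1} - \Sigma^{-1}(X-M)(X-M)^T \Sigma^{-1}\right] \quad \text{[from (A.41)-(A.46)]} . \tag{5.145} $$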
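The derivatives in (5.144) and (5.145) can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes n = 2, uses illustrative values for X, M, and Σ, and perturbs each entry of Σ independently (so the symmetry constraint is ignored during differentiation). All function names are hypothetical helpers.

```python
import math

def inv2(S):
    """Inverse of a 2x2 matrix stored as nested lists."""
    (a, b), (c, d) = S
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def f(X, M, S):
    """f of (5.143) for n = 2: 0.5*(X-M)^T S^{-1} (X-M) + 0.5*ln|S|."""
    v = [X[0] - M[0], X[1] - M[1]]
    Si = inv2(S)
    quad = sum(v[i] * Si[i][j] * v[j] for i in range(2) for j in range(2))
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    return 0.5 * quad + 0.5 * math.log(det)

def grad_M(X, M, S):
    """Closed form (5.144): -S^{-1}(X-M)."""
    v = [X[0] - M[0], X[1] - M[1]]
    Si = inv2(S)
    return [-(Si[i][0] * v[0] + Si[i][1] * v[1]) for i in range(2)]

def grad_S(X, M, S):
    """Closed form (5.145), valid for symmetric S."""
    v = [X[0] - M[0], X[1] - M[1]]
    Si = inv2(S)
    w = [Si[i][0] * v[0] + Si[i][1] * v[1] for i in range(2)]  # S^{-1}(X-M)
    return [[0.5 * (Si[i][j] - w[i] * w[j]) for j in range(2)] for i in range(2)]

def numeric_grad_M(X, M, S, eps=1e-6):
    """Central finite differences of f with respect to each component of M."""
    g = []
    for i in range(2):
        Mp, Mm = list(M), list(M)
        Mp[i] += eps
        Mm[i] -= eps
        g.append((f(X, Mp, S) - f(X, Mm, S)) / (2 * eps))
    return g

def numeric_grad_S(X, M, S, eps=1e-6):
    """Central finite differences of f with respect to each entry of S."""
    G = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            Sp = [row[:] for row in S]
            Sm = [row[:] for row in S]
            Sp[i][j] += eps
            Sm[i][j] -= eps
            G[i][j] = (f(X, M, Sp) - f(X, M, Sm)) / (2 * eps)
    return G

# Illustrative values (assumptions, not from the text).
X = [1.0, -0.5]
M = [0.2, 0.3]
S = [[2.0, 0.4], [0.4, 1.0]]

err_M = max(abs(a - b) for a, b in zip(grad_M(X, M, S), numeric_grad_M(X, M, S)))
err_S = max(abs(grad_S(X, M, S)[i][j] - numeric_grad_S(X, M, S)[i][j])
            for i in range(2) for j in range(2))
print("max error in df/dM:", err_M)
print("max error in df/dSigma:", err_S)
```

Both reported errors should be at the level of finite-difference noise if the closed forms are correct.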

If a sample Y is excluded, the quantity in (5.142) becomes

$$ h(X,Y) = \frac{1}{2N}\left[\{(X-M)^T \Sigma^{-1} (Y-M)\}^2 + n + 2(X-M)^T \Sigma^{-1} (Y-M) - (X-M)^T \Sigma^{-1} (X-M) - (Y-M)^T \Sigma^{-1} (Y-M)\right] . \tag{5.146} $$


Example 2: If f is evaluated at X = Y, h of (5.146) becomes

$$ h(Y,Y) = \frac{1}{2N}\left[d^4(Y) + n\right] , \tag{5.147} $$

where $d^2(Y) = (Y-M)^T \Sigma^{-1} (Y-M)$. Equation (5.147) is the same as (5.135), except that the true parameters $M$ and $\Sigma$ are used this time instead of the estimates $\hat{M}_i$ and $\hat{\Sigma}_i$ used in (5.135).
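The reduction from (5.146) to (5.147) can be confirmed with a short numerical check. This is a sketch under stated assumptions: n = 2, a symmetric Σ, and illustrative values for M, Σ, N, X, and Y that do not come from the text. The helper names are hypothetical.

```python
def inv2(S):
    """Inverse of a 2x2 matrix stored as nested lists."""
    (a, b), (c, d) = S
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def bilinear(A, B, M, Si):
    """(A-M)^T Si (B-M) for 2-dimensional vectors."""
    u = [A[0] - M[0], A[1] - M[1]]
    v = [B[0] - M[0], B[1] - M[1]]
    return sum(u[i] * Si[i][j] * v[j] for i in range(2) for j in range(2))

def h(X, Y, M, S, N):
    """h(X, Y) as written in (5.146)."""
    n = len(M)
    Si = inv2(S)
    xy = bilinear(X, Y, M, Si)
    xx = bilinear(X, X, M, Si)
    yy = bilinear(Y, Y, M, Si)
    return (xy ** 2 + n + 2 * xy - xx - yy) / (2 * N)

def h_same(Y, M, S, N):
    """h(Y, Y) as written in (5.147): (d^4(Y) + n) / (2N)."""
    n = len(M)
    d2 = bilinear(Y, Y, M, inv2(S))
    return (d2 ** 2 + n) / (2 * N)

# Illustrative values (assumptions, not from the text).
M = [0.2, 0.3]
S = [[2.0, 0.4], [0.4, 1.0]]
N = 100
X = [1.0, -0.5]
Y = [-0.7, 1.1]

print("h(Y,Y) from (5.146):", h(Y, Y, M, S, N))
print("h(Y,Y) from (5.147):", h_same(Y, M, S, N))
```

Setting X = Y makes the cross term 2d²(Y) cancel against the two quadratic terms, which is why only d⁴(Y) + n remains.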

                     Resubstitution Error for the Quadratic Classifier

Error expression: When the L method is used, design and test samples are independent. Therefore, when the expectation is taken on the classification error of (5.98), we can isolate two expectations: one with respect to design samples and the other with respect to test samples. In (5.101), the randomness of h comes from design samples, and X is the test sample, independent of the design samples. Therefore, $E_d\{e^{j\omega h(X)}\}$ can be computed for a given X. The expectation with respect to test samples is obtained by computing $\int [\cdot]\, p_i(X)\, dX$. On the other hand, when the R method is used, design and test samples are no longer independent. Thus, we cannot isolate two expectations. Since X is a