
72                         Introduction to Statistical Pattern Recognition







         E{d² | ω₂} = Σᵢ₌₁ⁿ λᵢ + MᵀM                                   (3.63)


                      Likewise, the variance can be computed as

         Var{d² | ω₂} = E{(d²)² | ω₂} − E²{d² | ω₂}                    (3.64)
                      When the ω₂-distribution is normal,

         E{(d²)² | ω₂} = E{(X−M)ᵀ(X−M)(X−M)ᵀ(X−M) | ω₂}
                         + 4MᵀE{(X−M)(X−M)ᵀ | ω₂}M
                         + (MᵀM)² + 2E{(X−M)ᵀ(X−M) | ω₂}MᵀM
                       = 3Σᵢ₌₁ⁿ λᵢ² + Σᵢ≠ⱼ λᵢλⱼ + 4Σᵢ₌₁ⁿ λᵢmᵢ²
                         + (MᵀM)² + 2(Σᵢ₌₁ⁿ λᵢ)MᵀM                     (3.65)
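The first equality in (3.65) drops the odd-order terms, which vanish for a normal distribution; the second uses the normal fourth-moment result E{(ZᵀZ)²} = 3Σλᵢ² + Σᵢ≠ⱼ λᵢλⱼ for Z = X−M ~ N(0, Λ). A minimal Monte Carlo check of that identity (the eigenvalues below are arbitrary illustrative values, not from the text):

```python
import numpy as np

# Check E{(Z^T Z)^2} = 3*sum(lam_i^2) + sum_{i != j} lam_i*lam_j
# for Z ~ N(0, Lambda), Lambda diagonal (hypothetical eigenvalues).
rng = np.random.default_rng(2)
lam = np.array([0.5, 1.0, 2.0])

# Sample Z with independent components of variance lam_i
Z = rng.standard_normal((500_000, lam.size)) * np.sqrt(lam)
ztz2 = ((Z**2).sum(axis=1)) ** 2

# Theoretical value: note sum_{i != j} lam_i lam_j = (sum lam)^2 - sum lam^2
theory = 3 * (lam**2).sum() + (lam.sum() ** 2 - (lam**2).sum())
print(ztz2.mean(), theory)  # sample mean close to theory
```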


                      where mᵢ is the ith component of M. Subtracting E²{d² | ω₂} of (3.63), we
                      obtain

         Var{d² | ω₂} = 2Σᵢ₌₁ⁿ λᵢ² + 4Σᵢ₌₁ⁿ λᵢmᵢ²                      (3.66)
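Both moments of d² = XᵀX under ω₂ can be checked numerically. The sketch below, with an arbitrary diagonal Λ and mean M (illustrative values, not from the text), compares sample moments against (3.63) and (3.66):

```python
import numpy as np

# Monte Carlo check of E{d^2|w2} and Var{d^2|w2} for d^2 = X^T X,
# X ~ N(M, Lambda) under w2 (hypothetical lam and M values).
rng = np.random.default_rng(0)
lam = np.array([1.0, 2.0, 0.5, 1.5])   # eigenvalues lambda_i
M = np.array([1.0, -0.5, 0.25, 0.75])  # mean vector

# Theoretical moments from (3.63) and (3.66)
mean_theory = lam.sum() + M @ M                          # sum(lam) + M^T M
var_theory = 2 * (lam**2).sum() + 4 * (lam * M**2).sum()

# Simulate X = M + Z with Z ~ N(0, Lambda), then d^2 = X^T X per sample
X = M + rng.standard_normal((200_000, lam.size)) * np.sqrt(lam)
d2 = (X**2).sum(axis=1)
print(d2.mean(), mean_theory)  # close agreement
print(d2.var(), var_theory)
```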



                           Example 8:  For Data I-I with n variables, λᵢ = 1.  Therefore,

                                E{d² | ω₁} = n   and   Var{d² | ω₁} = 2n,          (3.67)

                                E{d² | ω₂} = n + MᵀM   and   Var{d² | ω₂} = 2n + 4MᵀM.   (3.68)
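For Data I-I the moments reduce to simple functions of n and MᵀM, and a quick simulation reproduces (3.67) and (3.68) (the values of n and M below are arbitrary choices for illustration):

```python
import numpy as np

# Data I-I sketch: w1 ~ N(0, I), w2 ~ N(M, I), d^2 = X^T X.
rng = np.random.default_rng(1)
n = 8
M = np.full(n, 0.5)   # hypothetical mean vector; M^T M = 2.0
mtm = M @ M

X1 = rng.standard_normal((200_000, n))        # samples from w1
X2 = M + rng.standard_normal((200_000, n))    # samples from w2
d2_1 = (X1**2).sum(axis=1)
d2_2 = (X2**2).sum(axis=1)

print(d2_1.mean(), d2_1.var())  # approx n and 2n            (3.67)
print(d2_2.mean(), d2_2.var())  # approx n + M^T M, 2n + 4 M^T M  (3.68)
```

With λᵢ = 1, d² is in fact chi-square distributed: central with n degrees of freedom under ω₁, and noncentral with noncentrality MᵀM under ω₂, whose known moments give (3.67) and (3.68) directly.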

                       If we assume normal distributions for d², we can design the Bayes classifier
                       and compute the Bayes error in the d-space, ε_d. The normality assumption for
                       d² is reasonable for high-dimensional data because d² is the summation of n
                       terms as seen in (3.51), and the central limit theorem can be applied. The ε_d is
                       determined by n and MᵀM, while MᵀM specifies the Bayes error in the X-
                       space. In order to show how much classification information is lost by
                       mapping the n-dimensional X into the one-dimensional d², the relation between