Page 114 - Introduction to Statistical Pattern Recognition


$$+ \frac{1}{2}\ln\frac{|\Sigma_1|}{|\Sigma_2|} - t . \qquad (3.140)$$


Or, after simultaneous diagonalization from $\Sigma_1$ and $\Sigma_2$ to $I$ and $\Lambda$ by $Y = A^T X$,


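$$E\{h(Y)\,|\,\omega_1\} = \frac{1}{2}\sum_{i=1}^{n}\Bigl[\Bigl(1 - \frac{1}{\lambda_i}\Bigr) - \frac{(d_{2i}-d_{1i})^2}{\lambda_i} + \ln\frac{1}{\lambda_i}\Bigr] - t , \qquad (3.141)$$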


$$E\{h(Y)\,|\,\omega_2\} = \frac{1}{2}\sum_{i=1}^{n}\Bigl[(\lambda_i - 1) + (d_{2i}-d_{1i})^2 + \ln\frac{1}{\lambda_i}\Bigr] - t , \qquad (3.142)$$
where $d_{ki}$ is the $i$th component of $D_k = A^T M_k$.
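The Y-space expressions can be checked numerically. The following is a minimal sketch, assuming the quadratic form $h(X) = \frac{1}{2}(X-M_1)^T\Sigma_1^{-1}(X-M_1) - \frac{1}{2}(X-M_2)^T\Sigma_2^{-1}(X-M_2) + \frac{1}{2}\ln(|\Sigma_1|/|\Sigma_2|) - t$, which is consistent with the tail of (3.140) but is not restated on this page. The function names and the example means and covariances are hypothetical and chosen only for illustration. The sketch builds $A$ by whitening $\Sigma_1$ and then diagonalizing the transformed $\Sigma_2$, evaluates the two sums in the Y-space, and compares them with the same expectations computed directly in the X-space from traces and Mahalanobis-type terms.

```python
import numpy as np

def simultaneous_diagonalization(S1, S2):
    """Return (A, lam) with A.T @ S1 @ A = I and A.T @ S2 @ A = diag(lam)."""
    theta, phi = np.linalg.eigh(S1)           # S1 = phi @ diag(theta) @ phi.T
    W = phi / np.sqrt(theta)                  # scale columns: W.T @ S1 @ W = I
    lam, psi = np.linalg.eigh(W.T @ S2 @ W)   # diagonalize the whitened S2
    return W @ psi, lam

def expected_h_yspace(M1, S1, M2, S2, t=0.0):
    """Expected values of h under class 1 and class 2, per (3.141) and (3.142)."""
    A, lam = simultaneous_diagonalization(S1, S2)
    d = A.T @ (M2 - M1)                       # components d_2i - d_1i
    e1 = 0.5 * np.sum((1.0 - 1.0 / lam) - d**2 / lam + np.log(1.0 / lam)) - t
    e2 = 0.5 * np.sum((lam - 1.0) + d**2 + np.log(1.0 / lam)) - t
    return e1, e2

def expected_h_xspace(M1, S1, M2, S2, t=0.0):
    """Same expectations computed directly in the X-space (assumed form of h)."""
    n = len(M1)
    dm = M2 - M1
    half_logdet = 0.5 * (np.linalg.slogdet(S1)[1] - np.linalg.slogdet(S2)[1])
    e1 = 0.5 * (n - np.trace(np.linalg.solve(S2, S1))
                - dm @ np.linalg.solve(S2, dm)) + half_logdet - t
    e2 = 0.5 * (np.trace(np.linalg.solve(S1, S2)) - n
                + dm @ np.linalg.solve(S1, dm)) + half_logdet - t
    return e1, e2

# Hypothetical two-dimensional example (values are illustrative only).
M1 = np.array([0.0, 0.0])
M2 = np.array([1.0, 2.0])
S1 = np.array([[2.0, 0.5], [0.5, 1.0]])
S2 = np.array([[1.0, -0.3], [-0.3, 3.0]])

print("Y-space:", expected_h_yspace(M1, S1, M2, S2))
print("X-space:", expected_h_xspace(M1, S1, M2, S2))
```

Because $h$ is quadratic in $X$, both routes use only the means and covariances of the two classes, so the two printed pairs should agree to rounding error.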
                          An  interesting property  emerges from  (3.141) and  (3.142).  That  is,  if
                     t = 0,


$$E\{h(Y)\,|\,\omega_1\} \le 0 \quad \text{and} \quad E\{h(Y)\,|\,\omega_2\} \ge 0 \qquad (3.143)$$

regardless of the distributions of X. These inequalities may be proved by using
$\ln x \le x - 1$. From (3.141), $(1 - 1/\lambda_i) + \ln(1/\lambda_i) \le 0$ and $-(d_{2i}-d_{1i})^2/\lambda_i \le 0$ yield
$E\{h(Y)\,|\,\omega_1\} \le 0$ for $t = 0$. Also, from (3.142), $(\lambda_i - 1) - \ln \lambda_i \ge 0$ and
$(d_{2i}-d_{1i})^2 \ge 0$ yield $E\{h(Y)\,|\,\omega_2\} \ge 0$ for $t = 0$.
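The two scalar inequalities used in this argument follow from $\ln x \le x - 1$ by a direct substitution. The step is written out below as a short sketch, assuming $\lambda_i > 0$, which holds when both covariance matrices are positive definite:

```latex
\begin{align*}
x = \frac{1}{\lambda_i}:&\quad
  \ln\frac{1}{\lambda_i} \le \frac{1}{\lambda_i} - 1
  \;\Longrightarrow\;
  \Bigl(1 - \frac{1}{\lambda_i}\Bigr) + \ln\frac{1}{\lambda_i} \le 0 , \\
x = \lambda_i:&\quad
  \ln\lambda_i \le \lambda_i - 1
  \;\Longrightarrow\;
  (\lambda_i - 1) - \ln\lambda_i \ge 0 .
\end{align*}
```

Equality holds in both lines only when $\lambda_i = 1$, so the expected values at $t = 0$ vanish only if every $\lambda_i = 1$ and $d_{2i} = d_{1i}$ for all $i$, that is, when the two classes share the same mean and covariance.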


                          Variance  of  h(X): The  computation of  the  variance is  more  involved.
                      Therefore, only  the  results  for  normal  distributions are  presented  here.  The
                      reader is encouraged to confirm these results.  It is suggested to work in the  Y-
                      space where the two covariances are diagonalized to I and A.