Page 114 - Introduction to Statistical Pattern Recognition
$$+ \frac{1}{2}\ln\frac{|\Sigma_1|}{|\Sigma_2|} - t . \tag{3.140}$$
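Or, after simultaneous diagonalization of $\Sigma_1$ and $\Sigma_2$ to $I$ and $\Lambda$ by $Y = A^T X$,

$$E\{h(Y)\mid\omega_1\} = \frac{1}{2}\sum_{i=1}^{n}\left[\left(1-\frac{1}{\lambda_i}\right) - \frac{(d_{2i}-d_{1i})^2}{\lambda_i} + \ln\frac{1}{\lambda_i}\right] - t , \tag{3.141}$$

$$E\{h(Y)\mid\omega_2\} = \frac{1}{2}\sum_{i=1}^{n}\left[(\lambda_i-1) + (d_{2i}-d_{1i})^2 + \ln\frac{1}{\lambda_i}\right] - t , \tag{3.142}$$

where $d_{ki}$ is the $i$th component of $D_k = A^T M_k$.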
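The diagonalized expressions can be checked numerically. The following is a minimal sketch, not the author's procedure: it builds $A$ by whitening $\Sigma_1$ and then rotating, evaluates (3.141) and (3.142), and compares them with the same expectations computed in the original X-space. The X-space formulas assume that $h(X)$ is the quadratic discriminant $\frac{1}{2}(X-M_1)^T\Sigma_1^{-1}(X-M_1) - \frac{1}{2}(X-M_2)^T\Sigma_2^{-1}(X-M_2) + \frac{1}{2}\ln(|\Sigma_1|/|\Sigma_2|) - t$; the function names are hypothetical.

```python
import numpy as np

def simultaneous_diagonalization(S1, S2):
    """Return (A, lam) with A.T @ S1 @ A = I and A.T @ S2 @ A = diag(lam)."""
    # Whiten S1: with S1 = Phi diag(theta) Phi^T, W = Phi diag(theta^-1/2).
    theta, Phi = np.linalg.eigh(S1)
    W = Phi / np.sqrt(theta)          # scales column j by theta_j^-1/2
    # Rotate with the eigenvectors of the whitened S2; I stays I.
    lam, Psi = np.linalg.eigh(W.T @ S2 @ W)
    return W @ Psi, lam

def expected_h_y_space(M1, M2, S1, S2, t=0.0):
    """Evaluate (3.141) and (3.142) in the diagonalized Y-space."""
    A, lam = simultaneous_diagonalization(S1, S2)
    delta2 = (A.T @ M2 - A.T @ M1) ** 2          # (d_2i - d_1i)^2
    e1 = 0.5 * np.sum((1 - 1 / lam) - delta2 / lam + np.log(1 / lam)) - t
    e2 = 0.5 * np.sum((lam - 1) + delta2 + np.log(1 / lam)) - t
    return e1, e2

def expected_h_x_space(M1, M2, S1, S2, t=0.0):
    """Same expectations in X-space (assumes the quadratic h(X) stated above)."""
    n = len(M1)
    dM = M2 - M1
    S1inv, S2inv = np.linalg.inv(S1), np.linalg.inv(S2)
    half_logdet = 0.5 * (np.linalg.slogdet(S1)[1] - np.linalg.slogdet(S2)[1])
    e1 = 0.5 * (n - np.trace(S2inv @ S1) - dM @ S2inv @ dM) + half_logdet - t
    e2 = 0.5 * (np.trace(S1inv @ S2) + dM @ S1inv @ dM - n) + half_logdet - t
    return e1, e2
```

Both routines should agree to rounding error for any pair of positive definite covariance matrices, since the expectations depend only on the means and covariances and the transformation $Y = A^T X$ is nonsingular.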
An interesting property emerges from (3.141) and (3.142). That is, if
t = 0,
$$E\{h(Y)\mid\omega_1\} \le 0 \quad\text{and}\quad E\{h(Y)\mid\omega_2\} \ge 0 \tag{3.143}$$
regardless of the distributions of X. These inequalities may be proved by using
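$\ln x \le x - 1$. From (3.141), $(1 - 1/\lambda_i) + \ln(1/\lambda_i) \le 0$ and $-(d_{2i}-d_{1i})^2/\lambda_i \le 0$ yield
$E\{h(Y)\mid\omega_1\} \le 0$ for $t = 0$. Also, from (3.142), $(\lambda_i - 1) - \ln\lambda_i \ge 0$ and
$(d_{2i}-d_{1i})^2 \ge 0$ yield $E\{h(Y)\mid\omega_2\} \ge 0$ for $t = 0$.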
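The sign property (3.143) can also be spot-checked numerically. The sketch below is illustrative only: it evaluates the two sums for $t = 0$ over randomly drawn positive $\lambda_i$ and arbitrary $d_{1i}$, $d_{2i}$, and asserts the two inequalities up to rounding error. The function name, the random ranges, and the tolerance are assumptions chosen for the demonstration.

```python
import numpy as np

def expected_h_at_zero_threshold(lam, d1, d2):
    """Return (E{h(Y)|w1}, E{h(Y)|w2}) from (3.141) and (3.142) with t = 0."""
    lam, d1, d2 = (np.asarray(v, dtype=float) for v in (lam, d1, d2))
    delta2 = (d2 - d1) ** 2
    # Each bracketed term is <= 0 because ln x <= x - 1 with x = 1/lam.
    e1 = 0.5 * np.sum((1 - 1 / lam) + np.log(1 / lam) - delta2 / lam)
    # Each bracketed term is >= 0 because ln x <= x - 1 with x = lam.
    e2 = 0.5 * np.sum((lam - 1) - np.log(lam) + delta2)
    return e1, e2

rng = np.random.default_rng(0)
for _ in range(1000):
    n = int(rng.integers(1, 6))                    # dimension between 1 and 5
    lam = np.exp(rng.normal(scale=2.0, size=n))    # positive eigenvalues
    d1, d2 = rng.normal(size=n), rng.normal(size=n)
    e1, e2 = expected_h_at_zero_threshold(lam, d1, d2)
    assert e1 <= 1e-12 and e2 >= -1e-12            # small slack for rounding
```

Equality holds in both inequalities only when every $\lambda_i = 1$ and $d_{1i} = d_{2i}$, that is, when the two classes share the same mean and covariance.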
Variance of h(X): The computation of the variance is more involved.
Therefore, only the results for normal distributions are presented here. The
reader is encouraged to confirm these results. It is suggested to work in the Y-space,
where the two covariances are diagonalized to $I$ and $\Lambda$.