E\{y\} \cong \bar{x}^{\,v} - \frac{v(1-v)}{2}\,\bar{x}^{\,v-2}\,E\{(x-\bar{x})^2\}
       = \left[ 1 - \frac{v(1-v)}{2}\,\frac{E\{(x-\bar{x})^2\}}{\bar{x}^2} \right] \bar{x}^{\,v}
       \cong (1 - 0.094)\,\bar{x}^{1/4} \qquad (v = 1/4) .                              (3.170)
This is reasonably close to \bar{x}^{1/4}, which is the first-order approximation. Probably, the first-order approximation in this case would be acceptable for qualitative discussions.
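As a concrete check of (3.170), the short sketch below compares the exact moment of a gamma variable, E\{x^v\} = \Gamma(\beta+1+v)/(\Gamma(\beta+1)\,\alpha^v), against the first- and second-order approximations. The choices \alpha = 1, \beta = 0 (an exponential density), and v = 1/4 are illustrative assumptions, not values from the text; with them the correction factor is 1 - 3/32 \approx 1 - 0.094.

    import math

    # Gamma density of (2.54): p(x) = a^(b+1) x^b exp(-a x) / Gamma(b+1),
    # with E{x} = (b+1)/a and Var{x} = (b+1)/a^2.
    a, b, v = 1.0, 0.0, 0.25      # illustrative values; b = 0 is an exponential

    mean = (b + 1) / a
    var = (b + 1) / a ** 2

    exact = math.gamma(b + 1 + v) / (math.gamma(b + 1) * a ** v)    # closed-form E{x^v}
    first = mean ** v                                               # first-order approximation
    second = (1 - 0.5 * v * (1 - v) * var / mean ** 2) * mean ** v  # second order, as in (3.170)

    print(f"exact E{{x^v}} = {exact:.4f}")    # about 0.9064
    print(f"first order   = {first:.4f}")     # 1.0000
    print(f"second order  = {second:.4f}")    # 1 - 3/32 = 0.90625

The second-order value essentially reproduces the exact moment here, supporting the remark that even the first-order approximation may suffice for qualitative discussion.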
The second point is that, by changing v of the transformation, the weights of the first and second terms of the Bhattacharyya distance vary. The smaller v is, the more the first term tends to dominate. That is, the class separability comes more from the mean-difference than from the covariance-difference. This means that we may have a better chance to design a linear classifier after the transformation with a small v.
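This weighting effect can be illustrated numerically. The sketch below is a rough Monte Carlo experiment under assumed class parameters (two gamma densities with the same \beta and different \alpha's, chosen only for the demonstration): it treats y = x^v as Gaussian within each class and evaluates the mean-difference and covariance-difference terms of the Bhattacharyya distance for several v.

    import numpy as np

    rng = np.random.default_rng(0)

    # Two illustrative gamma classes (shape = beta + 1, scale = 1/alpha);
    # same beta, different alpha.
    x1 = rng.gamma(shape=2.0, scale=1.0, size=200_000)
    x2 = rng.gamma(shape=2.0, scale=2.0, size=200_000)

    def bhattacharyya_terms(y1, y2):
        """Mean and covariance terms of the Bhattacharyya distance,
        treating each transformed class as univariate Gaussian."""
        m1, m2 = y1.mean(), y2.mean()
        v1, v2 = y1.var(), y2.var()
        mean_term = 0.125 * (m1 - m2) ** 2 / ((v1 + v2) / 2)
        cov_term = 0.5 * np.log(((v1 + v2) / 2) / np.sqrt(v1 * v2))
        return mean_term, cov_term

    for v in (1.0, 0.5, 0.25, 0.1):
        mt, ct = bhattacharyya_terms(x1 ** v, x2 ** v)
        print(f"v = {v:4.2f}: mean term = {mt:.4f}, cov term = {ct:.4f}")

As v decreases, the covariance term shrinks toward zero while the mean term stays of the same order, consistent with the claim that a small v shifts the separability toward the mean-difference.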
Furthermore, when two gamma density functions of x share the same \beta, we can achieve Var\{y \mid \omega_1\} = Var\{y \mid \omega_2\} by using another popular log-transformation y = \ln x [18]. Suppose that x has a gamma density of (2.54) and we apply y = \ln x; then
E\{y\} = \int_0^{\infty} (\ln x)\,\frac{\alpha^{\beta+1}}{\Gamma(\beta+1)}\,x^{\beta} e^{-\alpha x}\,dx
       = \frac{\Gamma'(\beta+1)}{\Gamma(\beta+1)} - \ln\alpha ,                          (3.171)

E\{y^2\} = \int_0^{\infty} (\ln x)^2\,\frac{\alpha^{\beta+1}}{\Gamma(\beta+1)}\,x^{\beta} e^{-\alpha x}\,dx
       = \frac{\Gamma''(\beta+1)}{\Gamma(\beta+1)} - 2(\ln\alpha)\,\frac{\Gamma'(\beta+1)}{\Gamma(\beta+1)} + (\ln\alpha)^2 ,    (3.172)

where \Gamma'(\beta+1) = d\Gamma(x)/dx\,|_{x=\beta+1} and \Gamma''(\beta+1) = d^2\Gamma(x)/dx^2\,|_{x=\beta+1}. Therefore,

Var\{y\} = E\{y^2\} - E^2\{y\} = \frac{\Gamma''(\beta+1)\,\Gamma(\beta+1) - \Gamma'^2(\beta+1)}{\Gamma^2(\beta+1)} .            (3.173)
The integrations of (3.171) and (3.172) are obtained from an integral table [19].
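These moments are easy to verify numerically. In the minimal sketch below, assuming SciPy is available, digamma(\beta+1) supplies \Gamma'(\beta+1)/\Gamma(\beta+1) in (3.171) and polygamma(1, \beta+1) supplies the closed form of (3.173); \beta = 1.5 and the \alpha values are illustrative assumptions.

    import numpy as np
    from scipy.special import digamma, polygamma

    rng = np.random.default_rng(0)
    beta = 1.5                                 # illustrative shape parameter

    for alpha in (0.5, 2.0, 8.0):              # several alpha's, same beta
        x = rng.gamma(shape=beta + 1, scale=1 / alpha, size=500_000)
        y = np.log(x)
        mean_theory = digamma(beta + 1) - np.log(alpha)    # (3.171)
        var_theory = float(polygamma(1, beta + 1))         # (3.173), alpha-free
        print(f"alpha = {alpha}: "
              f"E{{y}} sample {y.mean():+.4f}, theory {mean_theory:+.4f}; "
              f"Var{{y}} sample {y.var():.4f}, theory {var_theory:.4f}")

The sample variance stays essentially constant as \alpha varies, which is exactly the \alpha-independence used in the argument that follows.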
Note from (3.173) that Var\{y\} is independent of \alpha. Therefore, if two classes have different \alpha's but the same \beta, the variance-difference between the two classes disappears, and the class separability comes from the mean-difference only. Thus, after the transformation, the Bhattacharyya distance in the y-space becomes