E\{y\} \cong \bar{x}^{\,v} - \frac{v(1-v)}{2}\,\bar{x}^{\,v-2}\,E\{(x-\bar{x})^2\}
       = \left[ 1 - \frac{v(1-v)}{2}\,\frac{E\{(x-\bar{x})^2\}}{\bar{x}^2} \right] \bar{x}^{\,v}
       \cong (1 - 0.094)\,\bar{x}^{1/4} \qquad (v = 1/4) .                              (3.170)
This is reasonably close to \bar{x}^{1/4}, which is the first-order approximation. Probably, the first-order approximation in this case would be acceptable for qualitative discussions.
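As a concrete check of (3.170), the short sketch below compares the exact moment of a gamma variable, E\{x^v\} = \Gamma(\beta+1+v)/(\Gamma(\beta+1)\,\alpha^v), against the first- and second-order approximations. The choices \alpha = 1, \beta = 0 (an exponential density), and v = 1/4 are illustrative assumptions, not values from the text; with them the correction factor is 1 - 3/32 \approx 1 - 0.094.

    import math

    # Gamma density of (2.54): p(x) = a^(b+1) x^b exp(-a x) / Gamma(b+1),
    # with E{x} = (b+1)/a and Var{x} = (b+1)/a^2.
    a, b, v = 1.0, 0.0, 0.25      # illustrative values; b = 0 is an exponential

    mean = (b + 1) / a
    var = (b + 1) / a ** 2

    exact = math.gamma(b + 1 + v) / (math.gamma(b + 1) * a ** v)    # closed-form E{x^v}
    first = mean ** v                                               # first-order approximation
    second = (1 - 0.5 * v * (1 - v) * var / mean ** 2) * mean ** v  # second order, as in (3.170)

    print(f"exact E{{x^v}} = {exact:.4f}")    # about 0.9064
    print(f"first order   = {first:.4f}")     # 1.0000
    print(f"second order  = {second:.4f}")    # 1 - 3/32 = 0.90625

The second-order value essentially reproduces the exact moment here, supporting the remark that even the first-order approximation may suffice for qualitative discussion.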
The second point is that, by changing v of the transformation, the weights of the first and second terms of the Bhattacharyya distance vary. The smaller v is, the more the first term tends to dominate. That is, the class separability comes more from the mean-difference than from the covariance-difference. This means that we may have a better chance to design a linear classifier after the transformation with a small v.
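This weighting effect can be illustrated numerically. The sketch below is a rough Monte Carlo experiment under assumed class parameters (two gamma densities with the same \beta and different \alpha's, chosen only for the demonstration): it treats y = x^v as Gaussian within each class and evaluates the mean-difference and covariance-difference terms of the Bhattacharyya distance for several v.

    import numpy as np

    rng = np.random.default_rng(0)

    # Two illustrative gamma classes (shape = beta + 1, scale = 1/alpha);
    # same beta, different alpha.
    x1 = rng.gamma(shape=2.0, scale=1.0, size=200_000)
    x2 = rng.gamma(shape=2.0, scale=2.0, size=200_000)

    def bhattacharyya_terms(y1, y2):
        """Mean and covariance terms of the Bhattacharyya distance,
        treating each transformed class as univariate Gaussian."""
        m1, m2 = y1.mean(), y2.mean()
        v1, v2 = y1.var(), y2.var()
        mean_term = 0.125 * (m1 - m2) ** 2 / ((v1 + v2) / 2)
        cov_term = 0.5 * np.log(((v1 + v2) / 2) / np.sqrt(v1 * v2))
        return mean_term, cov_term

    for v in (1.0, 0.5, 0.25, 0.1):
        mt, ct = bhattacharyya_terms(x1 ** v, x2 ** v)
        print(f"v = {v:4.2f}: mean term = {mt:.4f}, cov term = {ct:.4f}")

As v decreases, the covariance term shrinks toward zero while the mean term stays of the same order, consistent with the claim that a small v shifts the separability toward the mean-difference.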
Furthermore, when two gamma density functions of x share the same \beta, we can achieve Var\{y \mid \omega_1\} = Var\{y \mid \omega_2\} by using another popular log-transformation y = \ln x [18]. Suppose that x has a gamma density of (2.54) and we apply y = \ln x; then
E\{y\} = \int_0^{\infty} (\ln x)\,\frac{\alpha^{\beta+1}}{\Gamma(\beta+1)}\,x^{\beta} e^{-\alpha x}\,dx
       = \frac{\Gamma'(\beta+1)}{\Gamma(\beta+1)} - \ln\alpha ,                          (3.171)

E\{y^2\} = \int_0^{\infty} (\ln x)^2\,\frac{\alpha^{\beta+1}}{\Gamma(\beta+1)}\,x^{\beta} e^{-\alpha x}\,dx
       = \frac{\Gamma''(\beta+1)}{\Gamma(\beta+1)} - 2(\ln\alpha)\,\frac{\Gamma'(\beta+1)}{\Gamma(\beta+1)} + (\ln\alpha)^2 ,    (3.172)

where \Gamma'(\beta+1) = d\Gamma(x)/dx\,|_{x=\beta+1} and \Gamma''(\beta+1) = d^2\Gamma(x)/dx^2\,|_{x=\beta+1}. Therefore,

Var\{y\} = E\{y^2\} - E^2\{y\} = \frac{\Gamma''(\beta+1)\,\Gamma(\beta+1) - \Gamma'^2(\beta+1)}{\Gamma^2(\beta+1)} .            (3.173)
The integrations of (3.171) and (3.172) are obtained from an integral table [19].
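These moments are easy to verify numerically. In the minimal sketch below, assuming SciPy is available, digamma(\beta+1) supplies \Gamma'(\beta+1)/\Gamma(\beta+1) in (3.171) and polygamma(1, \beta+1) supplies the closed form of (3.173); \beta = 1.5 and the \alpha values are illustrative assumptions.

    import numpy as np
    from scipy.special import digamma, polygamma

    rng = np.random.default_rng(0)
    beta = 1.5                                 # illustrative shape parameter

    for alpha in (0.5, 2.0, 8.0):              # several alpha's, same beta
        x = rng.gamma(shape=beta + 1, scale=1 / alpha, size=500_000)
        y = np.log(x)
        mean_theory = digamma(beta + 1) - np.log(alpha)    # (3.171)
        var_theory = float(polygamma(1, beta + 1))         # (3.173), alpha-free
        print(f"alpha = {alpha}: "
              f"E{{y}} sample {y.mean():+.4f}, theory {mean_theory:+.4f}; "
              f"Var{{y}} sample {y.var():.4f}, theory {var_theory:.4f}")

The sample variance stays essentially constant as \alpha varies, which is exactly the \alpha-independence used in the argument that follows.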
Note from (3.173) that Var\{y\} is independent of \alpha. Therefore, if two classes have different \alpha's but the same \beta, the variance-difference between the two classes disappears, and the class separability comes from the mean-difference only. Thus, after the transformation, the Bhattacharyya distance in the y-space becomes