Page 205 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB
6.1.3 Other criteria
The criteria discussed above are certainly not the only ones used in
feature selection and extraction. In fact, large families of performance
measures exist; see Devijver and Kittler (1982) for an extensive overview.
One family is the so-called probabilistic distance measures. These
measures are based on the observation that large differences between the
conditional densities $p(\mathbf{z}|\omega_k)$ result in small error rates. Let us assume a
two-class problem with conditional densities $p(\mathbf{z}|\omega_1)$ and $p(\mathbf{z}|\omega_2)$. Then,
a probabilistic distance takes the following form:
$$ J = \int_{-\infty}^{\infty} g\big(p(\mathbf{z}|\omega_1),\, p(\mathbf{z}|\omega_2)\big)\, d\mathbf{z} \tag{6.20} $$
The function $g(\cdot,\cdot)$ must be such that $J$ is zero when $p(\mathbf{z}|\omega_1) = p(\mathbf{z}|\omega_2)$,
$\forall \mathbf{z}$, and non-negative otherwise. In addition, we require that $J$ attains its
maximum whenever the two densities are completely non-overlapping.
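The two requirements on $g(\cdot,\cdot)$ can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes one-dimensional Gaussian class densities, a midpoint-rule integrator, and the absolute difference $g(a,b) = |a - b|$ as one illustrative choice of $g$. The names `make_gaussian`, `probabilistic_distance` and `g_abs` are hypothetical.

```python
import math

def make_gaussian(mean, std):
    """Return a 1-D normal density, used as a stand-in for p(z|w_k) (assumption)."""
    norm = std * math.sqrt(2.0 * math.pi)
    return lambda z: math.exp(-0.5 * ((z - mean) / std) ** 2) / norm

def probabilistic_distance(g, p1, p2, lo=-20.0, hi=20.0, n=20000):
    """Approximate J = integral of g(p1(z), p2(z)) dz with the midpoint rule."""
    dz = (hi - lo) / n
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * dz
        total += g(p1(z), p2(z))
    return total * dz

def g_abs(a, b):
    """Illustrative g: zero for equal arguments, non-negative otherwise."""
    return abs(a - b)

# Identical densities: J should vanish.
same = probabilistic_distance(g_abs, make_gaussian(0.0, 1.0), make_gaussian(0.0, 1.0))
# Partially overlapping densities: J lies between the two extremes.
overlap = probabilistic_distance(g_abs, make_gaussian(0.0, 1.0), make_gaussian(1.0, 1.0))
# Practically non-overlapping densities: J approaches its maximum (2 for this g).
apart = probabilistic_distance(g_abs, make_gaussian(-8.0, 1.0), make_gaussian(8.0, 1.0))

print(same, overlap, apart)
```

With this choice of $g$, the value of $J$ grows as the class means move apart and saturates once the densities no longer overlap, which is the behaviour required above.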
Obviously, the Bhattacharyya distance (6.14) and the Chernoff distance
(6.16) are examples of performance measures based on a probabilistic
distance. Other examples are the Matusita measure and the divergence
measures:
$$ J_{MATUSITA} = \sqrt{\int \left( \sqrt{p(\mathbf{z}|\omega_1)} - \sqrt{p(\mathbf{z}|\omega_2)} \right)^2 d\mathbf{z}} \tag{6.21} $$

$$ J_{DIVERGENCE} = \int \big( p(\mathbf{z}|\omega_1) - p(\mathbf{z}|\omega_2) \big) \ln \frac{p(\mathbf{z}|\omega_1)}{p(\mathbf{z}|\omega_2)}\, d\mathbf{z} \tag{6.22} $$
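As an illustration of (6.21) and (6.22), the sketch below evaluates both measures by numerical integration. It assumes two one-dimensional Gaussian densities with a common standard deviation, for which closed-form values are known and can serve as a check. The function names and the integration limits are hypothetical choices, not part of the text.

```python
import math

def make_gaussian(mean, std):
    """Return a 1-D normal density, used as a stand-in for p(z|w_k) (assumption)."""
    norm = std * math.sqrt(2.0 * math.pi)
    return lambda z: math.exp(-0.5 * ((z - mean) / std) ** 2) / norm

def integrate(f, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    dz = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dz) for i in range(n)) * dz

def matusita(p1, p2, lo, hi):
    """Matusita measure (6.21): root of the integrated squared difference of sqrt-densities."""
    return math.sqrt(integrate(lambda z: (math.sqrt(p1(z)) - math.sqrt(p2(z))) ** 2, lo, hi))

def divergence(p1, p2, lo, hi):
    """Divergence (6.22); both densities must be strictly positive on [lo, hi]."""
    def integrand(z):
        a, b = p1(z), p2(z)
        return (a - b) * math.log(a / b)
    return integrate(integrand, lo, hi)

mu1, mu2, sigma = 0.0, 2.0, 1.0
p1, p2 = make_gaussian(mu1, sigma), make_gaussian(mu2, sigma)

j_matusita = matusita(p1, p2, -10.0, 12.0)
j_divergence = divergence(p1, p2, -10.0, 12.0)

# Closed forms for equal-variance 1-D Gaussians, used only as a sanity check.
d2 = ((mu1 - mu2) / sigma) ** 2
matusita_exact = math.sqrt(2.0 - 2.0 * math.exp(-d2 / 8.0))
divergence_exact = d2

print(j_matusita, matusita_exact)
print(j_divergence, divergence_exact)
```

The integration range is kept finite so that neither density underflows to zero inside the logarithm; for wider ranges or heavier separation the integrand would need to be guarded.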
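These measures are useful for two-class problems. For more classes, a
measure is obtained by averaging the pairwise measures, weighted by the
prior probabilities of the classes involved. That is:

$$ J = \sum_{k=1}^{K} \sum_{l=1}^{K} P(\omega_k) P(\omega_l) J_{k,l} \tag{6.23} $$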
where $J_{k,l}$ is the measure found between the classes $\omega_k$ and $\omega_l$.
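The prior-weighted average over class pairs can be sketched as follows. This is an illustrative example under stated assumptions: both sums are taken over the $K$ classes, the classes are one-dimensional Gaussians with a common standard deviation, and the pairwise measure is the divergence (6.22), which for such densities reduces to $(\mu_k - \mu_l)^2 / \sigma^2$. The means, priors and function name are made up for the example.

```python
def average_pairwise_measure(priors, pairwise):
    """Sum of P(w_k) * P(w_l) * J_kl over all class pairs (k, l)."""
    n_classes = len(priors)
    total = 0.0
    for k in range(n_classes):
        for l in range(n_classes):
            total += priors[k] * priors[l] * pairwise[k][l]
    return total

# Hypothetical three-class problem (values chosen only for illustration).
means = [0.0, 1.0, 3.0]
std = 1.0
priors = [0.5, 0.3, 0.2]

# Pairwise divergence between equal-variance 1-D Gaussians; the diagonal is zero.
pairwise = [[((mk - ml) / std) ** 2 for ml in means] for mk in means]

J = average_pairwise_measure(priors, pairwise)
print(J)
```

Because $J_{k,k} = 0$ for any measure of the form (6.20), the diagonal terms do not contribute, and each distinct pair of classes is counted twice.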
Another family is the one using the probabilistic dependence of the
measurement vector $\mathbf{z}$ on the class $\omega_k$. If the conditional
density of $\mathbf{z}$ is not affected by $\omega_k$, i.e. $p(\mathbf{z}|\omega_k) = p(\mathbf{z})$, $\forall \mathbf{z}$, then an observation
of $\mathbf{z}$ does not increase our knowledge of $\omega_k$. Therefore, the ability of