Page 206 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB


$\mathbf{z}$ to discriminate class $\omega_k$ from the rest can be expressed as a measure of
the probabilistic dependence:

$$\int_{-\infty}^{\infty} g\big(p(\mathbf{z}|\omega_k),\, p(\mathbf{z})\big)\, d\mathbf{z} \tag{6.24}$$
where the function $g(\cdot,\cdot)$ must have properties similar to those in (6.20). In
order to incorporate all classes, a weighted sum of (6.24) is formed to get
the final performance measure. As an example, the Chernoff measure
now becomes:

$$J_{C\text{-}dep}(s) = \sum_{k=1}^{K} P(\omega_k) \int p^{s}(\mathbf{z}|\omega_k)\, p^{1-s}(\mathbf{z})\, d\mathbf{z} \tag{6.25}$$

Other dependence measures can be derived from the probabilistic distance
measures in a similar manner.
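
To make (6.25) concrete, the following Python sketch evaluates the prior-weighted sum of the integrals of p^s(z|w_k) p^(1-s)(z) for a scalar measurement, using the midpoint rule. The Gaussian class densities, the integration range and all function names are assumptions chosen for illustration; they do not come from the text. For the expression as evaluated here, and for 0 < s < 1, the value equals 1 when z is independent of the class and falls below 1 as the class conditional densities move apart.

```python
import math


def gauss_pdf(z, mean, std):
    """Density of a scalar Gaussian N(mean, std^2) at z (illustrative class model)."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def chernoff_dependence(priors, means, stds, s=0.5, lo=-20.0, hi=20.0, steps=8000):
    """Midpoint-rule estimate of sum_k P(w_k) * integral p(z|w_k)^s * p(z)^(1-s) dz."""
    dz = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * dz
        # Class conditional densities p(z|w_k) at this grid point.
        cond = [gauss_pdf(z, m, sd) for m, sd in zip(means, stds)]
        # Unconditional density p(z) = sum_k P(w_k) p(z|w_k).
        mix = sum(P * c for P, c in zip(priors, cond))
        if mix == 0.0:
            continue  # no probability mass here, nothing to add
        for P, c in zip(priors, cond):
            total += P * (c ** s) * (mix ** (1.0 - s)) * dz
    return total


# Two equiprobable classes: identical densities versus well separated means.
j_same = chernoff_dependence([0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
j_apart = chernoff_dependence([0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])
print(j_same, j_apart)
```

In this sketch the second value is the smaller one, which reflects the stronger dependence between the measurement and the class.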
A third family is founded on information theory and involves the
posterior probabilities $P(\omega_k|\mathbf{z})$. An example is Shannon's entropy
measure. For a given $\mathbf{z}$, the information about the true class associated
with $\mathbf{z}$ is quantified by Shannon's entropy:

$$H(\mathbf{z}) = -\sum_{k=1}^{K} P(\omega_k|\mathbf{z}) \log_2 P(\omega_k|\mathbf{z}) \tag{6.26}$$

Its expectation

$$J_{SHANNON} = \mathrm{E}[H(\mathbf{z})] = \int H(\mathbf{z})\, p(\mathbf{z})\, d\mathbf{z} \tag{6.27}$$

is a performance measure suitable for feature selection and extraction.
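
As an illustration of (6.26) and (6.27), the sketch below computes the posterior probabilities with Bayes' rule, evaluates the entropy in bits at each point of a grid, and averages it over p(z) with the midpoint rule. The scalar Gaussian class models, the integration range and the function names are assumptions made for this example only.

```python
import math


def gauss_pdf(z, mean, std):
    """Density of a scalar Gaussian N(mean, std^2) at z (illustrative class model)."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def expected_entropy(priors, means, stds, lo=-20.0, hi=20.0, steps=8000):
    """Midpoint-rule estimate of E[H(z)] = integral H(z) p(z) dz."""
    dz = (hi - lo) / steps
    acc = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * dz
        # Joint terms P(w_k) p(z|w_k); their sum is the unconditional density p(z).
        joint = [P * gauss_pdf(z, m, sd) for P, m, sd in zip(priors, means, stds)]
        pz = sum(joint)
        if pz == 0.0:
            continue  # no probability mass here, nothing to add
        posterior = [j / pz for j in joint]  # Bayes' rule: P(w_k|z)
        acc += entropy_bits(posterior) * pz * dz
    return acc


# Two equiprobable classes: an uninformative and an informative measurement.
j_uninformative = expected_entropy([0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
j_informative = expected_entropy([0.5, 0.5], [-3.0, 3.0], [1.0, 1.0])
print(j_uninformative, j_informative)
```

With identical class densities the posterior stays at the prior, so the expected entropy is 1 bit for two equiprobable classes; with well separated densities it approaches zero. A low expected entropy therefore indicates a measurement that is informative about the class.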

6.2 FEATURE SELECTION

This section introduces the problem of selecting a subset from the
N-dimensional measurement vector such that this subset is most suitable
for classification. Such a subset is called a feature set and its elements
are features. The problem is formalized as follows. Let
$F(N) = \{z_n \mid n = 0, \ldots, N-1\}$ be the set with elements from the measurement
vector $\mathbf{z}$. Furthermore, let $F_j(D) = \{y_d \mid d = 0, \ldots, D-1\}$ be a subset of
