Page 206 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB
$\mathbf{z}$ to discriminate class $\omega_k$ from the rest can be expressed in a measure of
the probabilistic dependence:
$$\int_{-\infty}^{\infty} g\bigl(p(\mathbf{z}|\omega_k),\, p(\mathbf{z})\bigr)\, d\mathbf{z} \tag{6.24}$$
where the function $g(\cdot,\cdot)$ must have properties similar to those in (6.20). In
order to incorporate all classes, a weighted sum of (6.24) is formed to get
the final performance measure. As an example, the Chernoff measure
now becomes:
$$J_{C\text{-}dep}(s) = \sum_{k=1}^{K} P(\omega_k) \int p^{s}(\mathbf{z}|\omega_k)\, p^{1-s}(\mathbf{z})\, d\mathbf{z} \tag{6.25}$$
Other dependence measures can be derived from the probabilistic distance
measures in a similar manner.
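As a rough illustration of (6.25), the sketch below approximates the Chernoff dependence measure numerically. It assumes a scalar measurement z and Gaussian class-conditional densities, neither of which is required by the text; the function names `gauss_pdf` and `chernoff_dependence`, the integration interval and the midpoint rule are likewise choices made only for this example.

```python
import math

def gauss_pdf(z, mean, std):
    """Density of a 1-D normal distribution (assumed class model)."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def chernoff_dependence(priors, means, stds, s=0.5, lo=-20.0, hi=20.0, steps=4000):
    """Approximate the sum in (6.25) with a midpoint Riemann sum over [lo, hi]."""
    dz = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * dz
        # Class-conditional densities p(z|w_k) at this grid point
        cond = [gauss_pdf(z, m, sd) for m, sd in zip(means, stds)]
        # Unconditional density p(z) = sum_k P(w_k) p(z|w_k)
        pz = sum(P * c for P, c in zip(priors, cond))
        for P, c in zip(priors, cond):
            total += P * (c ** s) * (pz ** (1.0 - s)) * dz
    return total

# Two equiprobable classes: identical densities versus well-separated densities
j_identical = chernoff_dependence([0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
j_separated = chernoff_dependence([0.5, 0.5], [-5.0, 5.0], [1.0, 1.0])
print(j_identical, j_separated)
```

In this form the sum equals 1 when every class density coincides with p(z), so that z carries no class information, and it falls below 1 as the class densities move apart.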
A third family is founded on information theory and involves the
posterior probabilities $P(\omega_k|\mathbf{z})$. An example is Shannon's entropy
measure. For a given $\mathbf{z}$, the information of the true class associated with $\mathbf{z}$ is
quantified by Shannon's entropy:
$$H(\mathbf{z}) = -\sum_{k=1}^{K} P(\omega_k|\mathbf{z}) \log_2 P(\omega_k|\mathbf{z}) \tag{6.26}$$
Its expectation
$$J_{SHANNON} = E[H(\mathbf{z})] = \int H(\mathbf{z})\, p(\mathbf{z})\, d\mathbf{z} \tag{6.27}$$
is a performance measure suitable for feature selection and extraction.
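The entropy (6.26) and its expectation (6.27) can be sketched numerically as follows. The example again assumes a scalar measurement with Gaussian class-conditional densities; the helper names (`gauss_pdf`, `entropy_bits`, `shannon_measure`) and the integration grid are hypothetical and serve only to make the formulas concrete.

```python
import math

def gauss_pdf(z, mean, std):
    """Density of a 1-D normal distribution (assumed class model)."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def entropy_bits(posteriors):
    """Shannon entropy (6.26) in bits; terms with zero probability contribute 0."""
    return -sum(p * math.log2(p) for p in posteriors if p > 0.0)

def shannon_measure(priors, means, stds, lo=-20.0, hi=20.0, steps=4000):
    """Approximate (6.27), the expectation of H(z), with a midpoint Riemann sum."""
    dz = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * dz
        joint = [P * gauss_pdf(z, m, sd) for P, m, sd in zip(priors, means, stds)]
        pz = sum(joint)  # unconditional density p(z)
        if pz == 0.0:
            continue  # no probability mass here, nothing to add
        posteriors = [j / pz for j in joint]  # Bayes: P(w_k|z)
        total += entropy_bits(posteriors) * pz * dz
    return total

# Two equiprobable classes: identical densities versus well-separated densities
j_identical = shannon_measure([0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
j_separated = shannon_measure([0.5, 0.5], [-5.0, 5.0], [1.0, 1.0])
print(j_identical, j_separated)
```

With identical class densities the posteriors stay at the priors and the measure is 1 bit for two equiprobable classes. With well-separated densities the posteriors are close to 0 or 1 almost everywhere and the measure approaches 0, so a lower value indicates less remaining uncertainty about the class.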
6.2 FEATURE SELECTION
This section introduces the problem of selecting a subset from the
N-dimensional measurement vector such that this subset is most suitable
for classification. Such a subset is called a feature set and its elements
are features. The problem is formalized as follows. Let
$F(N) = \{z_n \mid n = 0, \ldots, N-1\}$ be the set with elements from the measurement
vector $\mathbf{z}$. Furthermore, let $F_j(D) = \{y_d \mid d = 0, \ldots, D-1\}$ be a subset of