Page 197 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB
FEATURE EXTRACTION AND SELECTION
vector does not contain class information, and that the class dis-
tinction is obfuscated by this noise.
. The measure is invariant to reversible linear transforms. Suppose
that the measurement space is transformed to a feature space, i.e.
y = Az with A an invertible matrix; then the measure expressed
in the y space should be exactly the same as the one expressed in
the z space. This property is based on the fact that both spaces carry
the same class information.
. The measure should be simple to manipulate mathematically. Preferably,
the derivatives of the criterion are easily obtained, since the measure
will be used as an optimization criterion.
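The invariance property above can be checked numerically. The sketch below (Python/NumPy rather than the book's MATLAB, using made-up two-class data) evaluates a common scatter-based inter/intra-type measure, trace(S_w^-1 S_b), before and after an invertible linear transform y = Az:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic classes in a 2-D measurement space (illustrative data only).
z1 = rng.normal([0.0, 0.0], 1.0, size=(100, 2))
z2 = rng.normal([3.0, 1.0], 1.0, size=(100, 2))

def inter_intra(zs):
    """trace(Sw^-1 Sb): between-class scatter relative to within-class scatter."""
    mean_all = np.mean(np.vstack(zs), axis=0)
    # Within-class scatter: sum of per-class scatter matrices.
    Sw = sum(np.cov(z, rowvar=False) * (len(z) - 1) for z in zs)
    # Between-class scatter: weighted outer products of mean differences.
    Sb = sum(len(z) * np.outer(np.mean(z, axis=0) - mean_all,
                               np.mean(z, axis=0) - mean_all) for z in zs)
    return np.trace(np.linalg.solve(Sw, Sb))

A = np.array([[2.0, 1.0],
              [0.5, 3.0]])            # an arbitrary invertible matrix
J_z = inter_intra([z1, z2])           # measure in the z space
J_y = inter_intra([z1 @ A.T, z2 @ A.T])  # measure in the y = Az space
print(np.isclose(J_z, J_y))           # True: invariant under y = Az
```

The invariance follows because both S_w and S_b transform as A S A^T, so the transform cancels inside trace(S_w^-1 S_b).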
From the various measures known in the literature (Devijver and Kittler, 1982), two will be discussed. One of them – the interclass/intraclass distance (Section 6.1.1) – applies to the multi-class case. It is useful if class information is mainly found in the differences between expectation vectors in the measurement space, while at the same time the scattering of the measurement vectors (due to noise) is class-independent. The second measure – the
Chernoff distance (Section 6.1.2) – is particularly useful in the two-class
case because it can then be used to express bounds on the error rate.
Section 6.1.3 concludes with an overview of some other performance
measures.
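The error-bounding role of the Chernoff distance can be previewed with a small numerical sketch (Python rather than the book's MATLAB, with hypothetical class parameters). For two 1-D Gaussian classes with equal priors and equal variances, the Chernoff distance at s = 1/2 (the Bhattacharyya distance) yields the bound E ≤ sqrt(P1 P2) exp(-J_B), which can be compared with the exact Bayes error:

```python
from math import erf, exp, log, sqrt

# Two 1-D Gaussian classes, equal priors 1/2 (hypothetical parameters).
m1, m2, s1, s2 = 0.0, 2.0, 1.0, 1.0

# Bhattacharyya distance (Chernoff distance at s = 1/2) for Gaussian classes.
s_avg = 0.5 * (s1**2 + s2**2)
J_B = (m2 - m1)**2 / (8 * s_avg) + 0.5 * log(s_avg / (s1 * s2))

# Error-rate bound: E <= sqrt(P1*P2) * exp(-J_B), here sqrt(1/4) = 1/2.
bound = 0.5 * exp(-J_B)

# Exact Bayes error for equal-variance Gaussians: Phi(-|m2 - m1| / (2*sigma)).
true_err = 0.5 * (1 - erf((m2 - m1) / (2 * s1 * sqrt(2))))

print(true_err <= bound)  # True: the Chernoff/Bhattacharyya bound holds
```

With these parameters the bound is about 0.30 while the true error rate is about 0.16: loose, but cheap to evaluate and valid without computing the error integral.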
6.1.1 Inter/intra class distance
The inter/intra distance measure is based on the Euclidean distance
between pairs of samples in the training set. We assume that the class-
dependent distributions are such that the expectation vectors of the
different classes are discriminating. If fluctuations of the measurement
vectors around these expectations are due to noise, then these fluctu-
ations will not carry any class information. Therefore, our goal is to
arrive at a measure that is a monotonically increasing function of the
distance between expectation vectors, and a monotonically decreasing
function of the scattering around the expectations.
As in Chapter 5, T_S is a (labelled) training set with N_S samples. The classes ω_k are represented by subsets T_k ⊂ T_S, each class having N_k samples (Σ_k N_k = N_S). Measurement vectors in T_S – without reference to their class – are denoted by z_n. Measurement vectors in T_k (i.e. vectors coming from class ω_k) are denoted by z_{k,n}.
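The link between pairwise Euclidean distances and scatter can be verified numerically. The sketch below (Python/NumPy rather than the book's MATLAB, with made-up data) checks that the average squared Euclidean distance over all sample pairs equals twice the trace of the total scatter matrix – the identity the inter/intra measure builds on:

```python
import numpy as np

rng = np.random.default_rng(1)
Z = rng.normal(size=(50, 3))          # N_S = 50 measurement vectors z_n

# Average squared Euclidean distance over all N_S^2 sample pairs.
diffs = Z[:, None, :] - Z[None, :, :]
avg_sq_dist = np.mean(np.sum(diffs**2, axis=-1))

# Trace of the total scatter matrix (biased sample covariance, divide by N).
S = np.cov(Z, rowvar=False, bias=True)
print(np.isclose(avg_sq_dist, 2 * np.trace(S)))  # True
```

This is why averaging squared pair distances, rather than forming all pairs explicitly, leads to the computationally cheap scatter-matrix formulation developed next.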