features from the observed samples. This problem is called feature selection or extraction and is another important subject of pattern recognition. However, it should be noted that, as long as features are computed from the measurements, the set of features cannot carry more classification information than the measurements. As a result, the Bayes error in the feature space is never smaller than that in the measurement space.
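As a simple illustration of this point, consider two Gaussian classes with equal priors and a common covariance matrix, for which the Bayes error has the closed form Φ(-d/2), where d is the Mahalanobis distance between the class means. The sketch below (the class parameters and the projection direction are made up for illustration, not taken from the text) compares the error in the two-dimensional measurement space with the error after projecting each sample onto a single linear feature:

    import numpy as np
    from scipy.stats import norm

    # Two equal-covariance Gaussian classes (hypothetical parameters).
    m1 = np.array([0.0, 0.0])
    m2 = np.array([2.0, 1.0])
    Sigma = np.array([[1.0, 0.3],
                      [0.3, 1.0]])
    diff = m2 - m1

    # Bayes error in the measurement space: Phi(-d/2), d = Mahalanobis distance.
    d_full = np.sqrt(diff @ np.linalg.inv(Sigma) @ diff)
    err_full = norm.cdf(-d_full / 2)

    # Bayes error after mapping each sample to the single feature y = a'x.
    a = np.array([1.0, 0.0])                        # an arbitrary linear feature
    d_proj = abs(a @ diff) / np.sqrt(a @ Sigma @ a)
    err_proj = norm.cdf(-d_proj / 2)

    print(err_full, err_proj)                       # err_proj >= err_full for any a

For any choice of a, the one-dimensional error is at least as large as the error in the measurement space, with equality only when a is proportional to Σ⁻¹(m₂ - m₁).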
Feature selection can be considered as a mapping from the n-dimensional
space to a lower-dimensional feature space. The mapping should be carried
out without severely reducing the class separability. Although most features
that a human being selects are nonlinear functions of the measurements, finding
the optimum nonlinear mapping functions is beyond our capability. So, the
discussion in this book is limited to linear mappings.
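In its simplest form, such a linear mapping collects m mapping vectors as the columns of an n x m matrix A and computes the features as y = Aᵀx. A minimal sketch (the dimensions and data are made up for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 5, 2                        # measurement and feature dimensionalities
    X = rng.normal(size=(100, n))      # 100 samples in the measurement space
    A = rng.normal(size=(n, m))        # columns are the m mapping vectors

    Y = X @ A                          # the same samples in the feature space
    print(Y.shape)                     # (100, 2)

How the columns of A should be chosen is the question addressed by the feature extraction criteria of Chapters 9 and 10.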
In Chapter 9, feature extraction for signal representation is discussed, in which the mapping is limited to orthonormal transformations and the mean-square error is minimized. On the other hand, in feature extraction for classification, the mapping is not limited to any specific form and the class separability is used as the criterion to be optimized. Feature extraction for classification is discussed in Chapter 10.
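A minimal sketch of such an orthonormal, minimum mean-square-error mapping is given below, using the eigenvectors of the sample covariance matrix as the transformation (a Karhunen-Loève / principal-component style choice); the data, dimensions, and the number of retained features are made up for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))              # hypothetical data, 5 measurements
    mean = X.mean(axis=0)

    # Orthonormal mapping: eigenvectors of the sample covariance matrix.
    cov = np.cov(X, rowvar=False, bias=True)
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]
    Phi = eigvecs[:, order[:2]]                # keep the m = 2 dominant eigenvectors

    Y = (X - mean) @ Phi                       # the m orthonormal features
    X_hat = Y @ Phi.T + mean                   # reconstruction from the features
    mse = np.mean(np.sum((X - X_hat) ** 2, axis=1))
    print(mse, eigvals[order[2:]].sum())       # MSE = sum of discarded eigenvalues

Among all orthonormal mappings to m features, this choice minimizes the mean-square reconstruction error, which equals the sum of the discarded eigenvalues.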
It is sometimes important to decompose a given distribution into several
clusters. This operation is called clustering or unsupervised classification (or
learning). The subject is discussed in Chapter 11.
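As one concrete example of such a procedure (offered only as an illustration, not as a summary of Chapter 11), the sketch below implements a basic k-means style clustering: samples are repeatedly assigned to the nearest cluster mean and the means are recomputed. The data and parameters are hypothetical.

    import numpy as np

    def kmeans(X, k, n_iter=50, seed=0):
        # cluster the rows of X into k groups by nearest-mean reassignment
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)].copy()
        for _ in range(n_iter):
            # assign each sample to its nearest cluster mean
            dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            labels = dist.argmin(axis=1)
            # move each mean to the average of its assigned samples
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = X[labels == j].mean(axis=0)
        return labels, centers

    # three made-up clusters in two dimensions
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(loc=c, scale=0.5, size=(50, 2))
                   for c in ([0, 0], [4, 0], [0, 4])])
    labels, centers = kmeans(X, k=3)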
1.2 Process of Classifier Design
Figure 1-6 shows a flow chart of how a classifier is designed. After data is gathered, samples are normalized and registered. Normalization and registration are very important processes for a successful classifier design. However, different data requires different normalization and registration, and it is difficult to discuss these subjects in a generalized way. Therefore, these subjects are not included in this book.
After normalization and registration, the class separability of the data is
measured. This is done by estimating the Bayes error in the measurement
space. Since it is not appropriate at this stage to assume a mathematical form
for the data structure, the estimation procedure must be nonparametric. If the
Bayes error is larger than the final classifier error we wish to achieve (denoted by ε₀), the data does not carry enough classification information to meet the
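One simple nonparametric indicator that can be computed at this stage is the leave-one-out error of the nearest-neighbor rule: asymptotically the nearest-neighbor error lies between the Bayes error and twice the Bayes error, so it gives a rough, distribution-free check of the separability of the raw measurements. The sketch below uses made-up data; the estimation procedures actually developed in this book are treated in later chapters.

    import numpy as np

    def nn_loo_error(X, y):
        # leave-one-out error rate of the nearest-neighbor rule
        n = len(X)
        errors = 0
        for i in range(n):
            d = np.linalg.norm(X - X[i], axis=1)
            d[i] = np.inf                      # exclude sample i itself
            errors += (y[d.argmin()] != y[i])
        return errors / n

    # two hypothetical classes of 3-dimensional measurements
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, size=(100, 3)),
                   rng.normal(1.5, 1.0, size=(100, 3))])
    y = np.array([0] * 100 + [1] * 100)
    print(nn_loo_error(X, y))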
specification. Selecting features and designing a classifier in the later stages