Page 213 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB
202 FEATURE EXTRACTION AND SELECTION
Listing 6.1
PRTools code for performing feature selection.
% Create a labeled dataset with 8 features, of which only 2
% are useful, and apply various feature selection methods
z = gendatd(200,8,3,3);
w = featselm(z,'maha-s','forward',2);   % Forward selection
figure; clf; scatterd(z*w);
title(['forward: ' num2str(+w{2})]);
w = featselm(z,'maha-s','backward',2);  % Backward selection
figure; clf; scatterd(z*w);
title(['backward: ' num2str(+w{2})]);
w = featselm(z,'maha-s','b&b',2);       % B&B selection
figure; clf; scatterd(z*w);
title(['b&b: ' num2str(+w{2})]);
The function gendatd creates a data set in which only the first two
measurements are informative, while all other measurements contain only
noise (in those measurements the classes overlap completely). The listing
shows three possible feature selection methods. All of them are able to
retrieve the correct two features. The main difference is in the required
computing time: finding two features out of eight is done most efficiently
by the forward selection method, which only has to add two features, while
backward selection, which has to remove six, is the least efficient.
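The difference in computing time can be checked directly. The following is a minimal sketch, assuming PRTools is on the MATLAB path. It reuses only the calls that appear in Listing 6.1, plus the built-in tic and toc; the variable name searches is chosen here for illustration. The measured times depend on the machine and the PRTools version, so no particular figures are implied.

```matlab
% Time the three search strategies of Listing 6.1 on one data set.
% Assumes PRTools is installed; timings are machine dependent.
z = gendatd(200,8,3,3);                 % 8 features, only the first 2 informative
searches = {'forward','backward','b&b'};

for i = 1:numel(searches)
    tic;                                % start timer
    w = featselm(z,'maha-s',searches{i},2);
    t = toc;                            % elapsed time in seconds
    % +w{2} is the expression Listing 6.1 uses to show the selected features
    fprintf('%-8s: features %s, %.3f s\n', searches{i}, num2str(+w{2}), t);
end
```

Running the loop several times and averaging gives a steadier comparison, since a single tic/toc measurement can be dominated by start-up overhead.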
6.3 LINEAR FEATURE EXTRACTION
Another approach to reduce the dimension of the measurement vector is
to use a transformed space instead of the original measurement space.
Suppose that $W(\cdot)$ is a transformation that maps the measurement space
$\mathbb{R}^N$ onto a reduced space $\mathbb{R}^D$, $D \ll N$. Application of the transformation
to a measurement vector $z$ yields a feature vector $y \in \mathbb{R}^D$:
$$y = W(z) \qquad (6.30)$$
Classification is based on the feature vector rather than on the measure-
ment vector; see Figure 6.7.
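For the linear case treated in this section, the transformation can be written as a D x N matrix applied to the measurement vector. The sketch below uses plain MATLAB (no PRTools) to contrast feature selection, where the matrix merely picks elements, with feature extraction, where every feature combines all elements. The sizes, the example vector and the two weight rows (a mean and a linear trend) are assumptions chosen for illustration, not transformations taken from the text.

```matlab
% Hypothetical sizes: N = 8 measurements reduced to D = 2 features.
N = 8;  D = 2;
z = (1:N)';                          % one illustrative measurement vector (N x 1)

% Feature selection as a special case: W_sel keeps elements 1 and 2 of z
% and ignores the remaining N - D elements.
W_sel = [eye(D), zeros(D, N - D)];   % D x N selection matrix
y_sel = W_sel * z;                   % same as z(1:D)

% Feature extraction: each feature is a weighted sum of all N elements.
W_ext = [ones(1, N) / N;             % feature 1: mean of the measurements
         linspace(-1, 1, N)];        % feature 2: linear trend weighting
y_ext = W_ext * z;                   % D x 1 feature vector

disp(size(y_ext));                   % dimensions of the feature vector
```

In both cases the classifier would operate on the D-dimensional result; only the extraction matrix lets all N measurements contribute to it.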
The advantage of feature extraction over feature selection is that no
information from any of the elements of the measurement vector needs
to be discarded. Furthermore, in some situations feature extraction is easier
than feature selection. A disadvantage of feature extraction is that it
requires the determination of some suitable transformation $W(\cdot)$. If the