Page 325 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB

314 WORKED OUT EXAMPLES

PCA preprocessing is applied such that 90% of the variance is retained,
the performance of all the methods significantly decreases. To avoid this
clearly undesired effect, we will first rescale the data to have unit
variance and apply PCA on the resulting data. The basic training procedure
now becomes:

Listing 9.3

load housing.mat;
% Define a preprocessing: scale to unit variance, then PCA retaining 90% of the variance
w_pca = scalem([],'variance')*pca([],0.9);
% Define the classifier: preprocessing followed by ldc
w = w_pca*ldc;
% Perform 5-fold cross-validation
err_ldc_pca = crossval(z,w,5)
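
To make explicit what a 5-fold cross-validation estimate measures, the following is a minimal sketch in base MATLAB. It does not reproduce the internals of the PRTools routine crossval. The synthetic two-class data, the nearest-mean classifier and all variable names are assumptions introduced purely for illustration.

```matlab
% Sketch of k-fold cross-validation in base MATLAB (no PRTools needed).
% The data and the nearest-mean classifier are hypothetical stand-ins.
n = 100;                                   % objects per class
X = [randn(n,2); randn(n,2) + 3];          % two Gaussian classes, 2 features
y = [ones(n,1); 2*ones(n,1)];              % class labels 1 and 2

k = 5;                                     % number of folds
N = size(X,1);
fold = mod(randperm(N)', k) + 1;           % random fold index 1..k per object

err = zeros(k,1);
for i = 1:k
    tst = (fold == i);                     % held-out fold
    trn = ~tst;                            % remaining folds, used for training
    % Train: estimate one mean per class from the training objects only
    m1 = mean(X(trn & y == 1, :), 1);
    m2 = mean(X(trn & y == 2, :), 1);
    % Test: assign each held-out object to the nearest class mean
    d1 = sum(bsxfun(@minus, X(tst,:), m1).^2, 2);
    d2 = sum(bsxfun(@minus, X(tst,:), m2).^2, 2);
    yhat = 1 + (d2 < d1);                  % 1 if closer to m1, else 2
    err(i) = mean(yhat ~= y(tst));         % error on the held-out fold
end
err_cv = mean(err)                         % cross-validated error estimate
```

Each object is used for testing exactly once, and the classifier is always trained without the fold it is tested on. Any preprocessing that is estimated from data, such as a scaling or a PCA, belongs inside the training step for the same reason.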

It appears that, compared with normal scaling, the application of
pca([],0.9) does not significantly improve the performance. For some
methods, the performance increases slightly (16.6% (± 0.6%) error for
qdc, 13.6% (± 0.9%) for knnc), but for other methods, it decreases.
This indicates that the high-variance features are not much more informative
than the low-variance directions.
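
The effect of rescaling before PCA can be illustrated with a small sketch in base MATLAB. It is not the PRTools implementation of scalem or pca. The synthetic data matrix and all variable names are assumptions. The sketch compares the variance captured by the first principal direction with and without scaling, and then keeps the fewest components that retain at least 90% of the variance of the scaled data.

```matlab
% Sketch: unit-variance scaling followed by PCA retaining 90% of the variance.
% Base MATLAB only; the data matrix X is a hypothetical stand-in.
X = [100*randn(200,1), randn(200,1), 0.01*randn(200,1)];  % very different scales
X(:,2) = X(:,2) + 0.005*X(:,1);            % mild correlation between features 1 and 2

% Without scaling, the first principal direction is dominated by feature 1
lambda_raw = sort(eig(cov(X)), 'descend');
frac_raw = lambda_raw(1) / sum(lambda_raw);

% Step 1: rescale each feature to zero mean and unit variance
mu = mean(X, 1);
sd = std(X, 0, 1);
Z  = bsxfun(@rdivide, bsxfun(@minus, X, mu), sd);

% Step 2: eigen-decomposition of the covariance matrix of the scaled data
[V, D] = eig(cov(Z));
[lambda, order] = sort(diag(D), 'descend'); % variance along each direction
V = V(:, order);

% Step 3: keep the fewest components whose cumulative variance reaches 90%
frac  = cumsum(lambda) / sum(lambda);
ncomp = find(frac >= 0.9, 1, 'first');
Y = Z * V(:, 1:ncomp);                      % data projected on retained directions
```

On the unscaled data nearly all of the variance lies along the feature with the largest numerical range, so a 90% threshold would discard the other features regardless of how informative they are. After scaling, the retained directions are determined by the correlations between features rather than by their units.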

9.1.4 Feature selection

The use of a simple supervised feature extraction method, such as the
Bhattacharyya mapping (implemented by replacing the call to pca by
bhatm([])), also decreases the performance. We will therefore have to
use better feature selection methods to reduce the influence of noisy
features and to gain some performance.
We will first try branch-and-bound feature selection to find five
features, with the simple inter–intra class distance measure as a criterion.
For a given subset size, this search finds the optimal subset under that
criterion. Admittedly, the number of features selected, five, is arbitrary,
but the branch-and-bound method does not allow for finding the optimal
subset size.
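
The idea of selecting a fixed-size subset under an inter/intra criterion can be sketched in base MATLAB as follows. Two assumptions are made. First, the criterion is taken to be trace(Sw^-1 Sb), with Sw the within-class and Sb the between-class scatter matrix; the 'in-in' criterion in PRTools may be defined or normalized differently. Second, an exhaustive search over all subsets replaces branch-and-bound. For a criterion that is monotonic in the feature set, both should return the same subset, but the exhaustive search is only feasible for small problems. The data and variable names are hypothetical.

```matlab
% Sketch: pick the best subset of d features under an inter/intra criterion.
% Exhaustive search is used as a simple stand-in for branch-and-bound;
% the data, labels and sizes are hypothetical.
n = 150;                                   % objects per class
y = [ones(n,1); 2*ones(n,1)];              % class labels 1 and 2
X = randn(2*n, 6);                         % 6 features, initially pure noise
X(y == 2, 2) = X(y == 2, 2) + 3;           % feature 2 separates the classes
X(y == 2, 5) = X(y == 2, 5) + 2;           % feature 5 separates them less well

d = 2;                                     % subset size, fixed in advance
subsets = nchoosek(1:size(X,2), d);        % every subset of d features
J = zeros(size(subsets,1), 1);
for s = 1:size(subsets,1)
    Xs = X(:, subsets(s,:));
    m  = mean(Xs, 1);                      % overall mean
    Sw = zeros(d);
    Sb = zeros(d);
    for c = 1:2
        Xc = Xs(y == c, :);
        mc = mean(Xc, 1);                  % class mean
        Sw = Sw + size(Xc,1) * cov(Xc, 1);            % within-class scatter
        Sb = Sb + size(Xc,1) * (mc - m)' * (mc - m);  % between-class scatter
    end
    J(s) = trace(Sw \ Sb);                 % inter/intra distance criterion
end
[Jbest, ibest] = max(J);
best_subset = subsets(ibest, :)            % feature indices of the best subset
```

With this synthetic data the two shifted features, 2 and 5, are expected to score highest. Note that the subset size d has to be supplied: the criterion cannot decrease when features are added, so it cannot be used to choose d itself.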

Listing 9.4

% Load the housing dataset
load housing.mat;
% Construct scaling and feature selection mapping
w_fsf = featselo([],'in-in',5)*scalem([],'variance');