

            Listing 6.1
            PRTools code for performing feature selection.

            % Create a labeled dataset with 8 features, of which only 2
            % are useful, and apply various feature selection methods
            z = gendatd(200,8,3,3);

            w = featselm(z,'maha-s','forward',2);     % Forward selection
            figure; clf; scatterd(z*w);
            title(['forward: ' num2str(+w{2})]);
            w = featselm(z,'maha-s','backward',2);    % Backward selection
            figure; clf; scatterd(z*w);
            title(['backward: ' num2str(+w{2})]);
            w = featselm(z,'maha-s','b&b',2);         % B&B selection
            figure; clf; scatterd(z*w);
            title(['b&b: ' num2str(+w{2})]);

            The function gendatd creates a data set in which just the first two
            measurements are informative, while all other measurements contain
            only noise (where the classes completely overlap). The listing shows
            three possible feature selection methods. All of them are able to
            retrieve the correct two features. The main difference is in the
            required computing time: finding two features out of eight is
            handled most efficiently by the forward selection method, while
            backward selection is the most inefficient.
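
              To see the difference in computing time, the three search
            strategies can be timed on the same data set. The sketch below is
            only an illustration: it reuses the PRTools calls of Listing 6.1
            (including the +w{2} construct for retrieving the selected feature
            indices), and the tic/toc timings are indicative only.

            % Compare the three feature selection strategies of Listing 6.1
            z = gendatd(200,8,3,3);
            methods = {'forward','backward','b&b'};
            for i = 1:length(methods)
              tic;
              w = featselm(z,'maha-s',methods{i},2);  % select 2 of 8 features
              t = toc;
              fprintf('%-9s selected features: %s   time: %.2f s\n', ...
                      methods{i}, num2str(+w{2}), t);
            end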



            6.3   LINEAR FEATURE EXTRACTION

            Another approach to reduce the dimension of the measurement vector is
            to use a transformed space instead of the original measurement space.
            Suppose that W(·) is a transformation that maps the measurement
            space R^N onto a reduced space R^D, with D ≤ N. Application of the
            transformation to a measurement vector yields a feature vector
            y ∈ R^D:

                                        y = W(z)                        (6.30)

            Classification is based on the feature vector rather than on the measure-
            ment vector; see Figure 6.7.
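
              As an illustration of the mapping in (6.30) with a linear W, the
            following plain-MATLAB sketch builds one possible D x N matrix W
            from the leading principal directions of the data. PCA is used here
            only as an example choice of W, not as the method prescribed by the
            text, and the variable names are illustrative.

            N = 8; D = 2;                        % original and reduced dimensions
            Z  = randn(200,N)*randn(N,N);        % 200 correlated example measurements
            Zc = Z - mean(Z,1);                  % centre the measurements
            C  = cov(Zc);                        % N x N sample covariance
            [V,E]   = eig(C);                    % eigenvectors/eigenvalues of C
            [~,idx] = sort(diag(E),'descend');   % order by decreasing eigenvalue
            W  = V(:,idx(1:D))';                 % D x N transformation matrix
            Y  = Zc*W';                          % rows of Y are feature vectors y = W*z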
              The advantage of feature extraction over feature selection is that
            no information from any of the elements of the measurement vector
            needs to be wasted. Furthermore, in some situations feature extraction
            is easier than feature selection. A disadvantage of feature extraction
            is that it requires the determination of some suitable transformation
            W(·). If the