Page 214 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB

LINEAR FEATURE EXTRACTION


            Figure 6.7  Feature extraction: the measurement vector z ∈ ℝ^N is mapped
            by the feature extraction W(·) to the feature vector y ∈ ℝ^D, which the
            pattern classification ω̂(·) maps to the assigned class ω̂ ∈ Ω.

            transform chosen is too complex, the ability to generalize from a small
            data set will be poor. On the other hand, if the transform chosen is too
            simple, it may constrain the decision boundaries to a form which is
            inappropriate to discriminate between classes. Another disadvantage is
            that all measurements will be used, even if some of them are useless. This
            might be unnecessarily expensive.
              This section discusses the design of linear feature extractors. The
            transformation W() is restricted to the class of linear operations. Such
            operations can be written as a matrix–vector product:

                                         y = Wz                        (6.31)

            where W is a D × N matrix. The reason for the restriction is threefold.
            First, a linear feature extraction is computationally efficient. Second, in
            many classification problems – though not all – linear features are
            appropriate. Third, a restriction to linear operations facilitates the math-
            ematical handling of the problem.
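              As a minimal numerical sketch of (6.31) — written here in NumPy rather
            than the book's MATLAB, with illustrative dimensions not taken from the
            text — a D × N matrix W maps each N-dimensional measurement vector z to
            a D-dimensional feature vector y:

            ```python
            import numpy as np

            N, D = 6, 2                        # illustrative dimensions, D < N
            rng = np.random.default_rng(0)

            W = rng.standard_normal((D, N))    # a D x N linear feature extractor
            z = rng.standard_normal(N)         # one measurement vector, z in R^N

            y = W @ z                          # feature vector, y in R^D (6.31)
            ```

            The same single matrix–vector product handles a whole batch of
            measurements at once if z is replaced by an N × n matrix of n samples.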
              An illustration of the computational efficiency of linear feature extraction
            is the Gaussian case. If covariance matrices are unequal, the number of
            calculations is on the order of KN²; see equation (2.20). Classification based
            on linear features requires about DN + KD² calculations. If D is very small
            compared with N, the extraction saves a large number of calculations.
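              A back-of-the-envelope count makes the saving concrete (the values of
            K, N and D below are illustrative, not from the text):

            ```python
            K, N, D = 4, 100, 3       # e.g. 4 classes, 100 measurements, 3 features

            cost_full = K * N**2              # order K*N^2 (equation 2.20)
            cost_lin  = D * N + K * D**2      # about D*N + K*D^2 with linear features

            print(cost_full, cost_lin)        # 40000 vs 336 when D << N
            ```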
              The example of Gaussian densities is also well suited to illustrate the
            appropriateness of linear features. Clearly, if the covariance matrices
            are equal, then (2.25) shows that linear features are optimal. On the
            other hand, if the expectation vectors are equal and the discriminatory
            information is in the differences between the covariance matrices,
            linear feature extraction may still be appropriate. This is shown in
            the example of Figure 2.10(b) where the covariance matrices are
            eccentric, differing only in their orientations. However, in the example
            shown in Figure 2.10(a) (concentric circles) linear features seem to be
            inappropriate. In practical situations, the covariance matrices will
            often differ in both shape and orientation. Linear feature extraction
            is likely to lead to a reduction of the dimension, but this reduction may
            be less than what is feasible with nonlinear feature extraction.
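              The equal-covariance case can be sketched numerically (a NumPy
            illustration with invented means and covariance, not the book's code):
            with a shared covariance matrix C, the two-class log-likelihood ratio
            reduces to a linear function of z, so a single linear feature wᵀz
            carries all the discriminatory information.

            ```python
            import numpy as np

            C = np.array([[2.0, 0.5],
                          [0.5, 1.0]])        # shared covariance (illustrative)
            mu1 = np.array([1.0, 0.0])        # illustrative class means
            mu2 = np.array([-1.0, 1.0])

            Cinv = np.linalg.inv(C)
            w = Cinv @ (mu1 - mu2)            # linear feature direction (W = w^T, D = 1)
            b = -0.5 * (mu1 + mu2) @ w        # offset of the linear discriminant

            z = np.array([0.5, 0.2])          # a measurement closer to mu1
            y = w @ z + b                     # scalar feature; its sign decides the class
            ```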