Page 205 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB

            FEATURE EXTRACTION AND SELECTION

            6.1.3  Other criteria

            The criteria discussed above are certainly not the only ones used in
            feature selection and extraction. In fact, large families of performance
            measures exist; see Devijver and Kittler (1982) for an extensive overview.
            One family is the so-called probabilistic distance measures. These
            measures are based on the observation that large differences between the
            conditional densities $p(z|\omega_k)$ result in small error rates. Let us assume a
            two-class problem with conditional densities $p(z|\omega_1)$ and $p(z|\omega_2)$. Then,
            a probabilistic distance takes the following form:

            $$J = \int_{-\infty}^{\infty} g\bigl(p(z|\omega_1),\, p(z|\omega_2)\bigr)\, dz \tag{6.20}$$

            The function $g(\cdot,\cdot)$ must be such that $J$ is zero when $p(z|\omega_1) = p(z|\omega_2)$,
            $\forall z$, and non-negative otherwise. In addition, we require that $J$ attains its
            maximum whenever the two densities are completely non-overlapping.
            Obviously, the Bhattacharyya distance (6.14) and the Chernoff distance
            (6.16) are examples of performance measures based on a probabilistic
            distance. Other examples are the Matusita measure and the divergence
            measures:


            $$J_{\text{MATUSITA}} = \sqrt{\int \left(\sqrt{p(z|\omega_1)} - \sqrt{p(z|\omega_2)}\right)^2 dz} \tag{6.21}$$

            $$J_{\text{DIVERGENCE}} = \int \bigl(p(z|\omega_1) - p(z|\omega_2)\bigr) \ln\frac{p(z|\omega_1)}{p(z|\omega_2)}\, dz \tag{6.22}$$

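            As an illustration, the measures (6.21) and (6.22) can be approximated numerically.
            The sketch below is not taken from the text. It assumes a one-dimensional measurement
            z, two Gaussian conditional densities with hypothetical parameters mu1, mu2 and sigma,
            and a plain trapezoidal rule on a finite interval. The helper names gauss_pdf,
            integrate, matusita and divergence are introduced here for illustration only.

```python
import math

def gauss_pdf(z, mu, sigma):
    """Univariate normal density N(mu, sigma^2) evaluated at z."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def matusita(p1, p2, a, b):
    """Numerical approximation of the Matusita measure (6.21) on [a, b]."""
    return math.sqrt(integrate(lambda z: (math.sqrt(p1(z)) - math.sqrt(p2(z))) ** 2, a, b))

def divergence(p1, p2, a, b):
    """Numerical approximation of the divergence (6.22) on [a, b].

    Assumes both densities are strictly positive on the whole interval.
    """
    return integrate(lambda z: (p1(z) - p2(z)) * math.log(p1(z) / p2(z)), a, b)

# Assumed example: two unit-variance Gaussians whose means are two sigma apart.
mu1, mu2, sigma = 0.0, 2.0, 1.0
p1 = lambda z: gauss_pdf(z, mu1, sigma)
p2 = lambda z: gauss_pdf(z, mu2, sigma)

# Finite interval standing in for the infinite integration range.
a = min(mu1, mu2) - 10.0 * sigma
b = max(mu1, mu2) + 10.0 * sigma

j_mat = matusita(p1, p2, a, b)
j_div = divergence(p1, p2, a, b)
print("Matusita:  ", j_mat)
print("Divergence:", j_div)
```

            For equal-variance Gaussians both integrals have closed forms, which gives a check
            on the numerical result: the divergence equals $(\mu_1-\mu_2)^2/\sigma^2$, and the
            Matusita measure equals $\sqrt{2\,(1-\exp(-(\mu_1-\mu_2)^2/(8\sigma^2)))}$. Passing
            the same density twice returns zero for both measures, in line with the requirement
            on $g(\cdot,\cdot)$.
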
            These measures are useful for two-class problems. For more than two classes, a
            measure is obtained by averaging over all pairs of classes, with each pair
            weighted by the product of its prior probabilities. That is:


            $$J = \sum_{k=1}^{K} \sum_{l=1}^{K} P(\omega_k)\, P(\omega_l)\, J_{k,l} \tag{6.23}$$

            where $J_{k,l}$ is the measure found between the classes $\omega_k$ and $\omega_l$.
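            The prior-weighted average (6.23) is easy to evaluate once the pairwise measures
            are available. The following is a minimal sketch, assuming that both indices run
            over the K classes and that the pairwise measures are stored in a K-by-K table.
            The function name average_pairwise and the numbers in the example are hypothetical.

```python
def average_pairwise(priors, pairwise):
    """Prior-weighted average of pairwise measures, in the form of (6.23).

    priors   : list of K prior probabilities P(omega_k)
    pairwise : K-by-K nested list, pairwise[k][l] = J_{k,l}
    Assumption: both indices run over all K classes.
    """
    K = len(priors)
    return sum(priors[k] * priors[l] * pairwise[k][l]
               for k in range(K) for l in range(K))

# Assumed three-class example; the table is symmetric with a zero diagonal,
# as is the case for the Matusita and divergence measures.
priors = [0.5, 0.3, 0.2]
pairwise = [
    [0.0, 4.0, 1.0],
    [4.0, 0.0, 9.0],
    [1.0, 9.0, 0.0],
]

j_avg = average_pairwise(priors, pairwise)
print("Average measure:", j_avg)
```

            With a symmetric table and a zero diagonal, each unordered pair of classes is
            counted twice in the double sum, so the result is twice the prior-weighted sum
            over distinct pairs.
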
              Another family is the one using the probabilistic dependence of the
            measurement vector $z$ on the class $\omega_k$. Suppose that the conditional
            density of $z$ is not affected by $\omega_k$, i.e. $p(z|\omega_k) = p(z)$, $\forall z$. Then an
            observation of $z$ does not increase our knowledge of $\omega_k$. Therefore, the ability of