Page 199 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB

188                         FEATURE EXTRACTION AND SELECTION

              An alternative way to represent these distances is by means of scatter
            matrices. A scatter matrix gives some information about the dispersion
            of a population of samples around their mean. For instance, the matrix
            that describes the scattering of vectors from class $\omega_k$ is:

$$\mathbf{S}_k = \frac{1}{N_k} \sum_{n=1}^{N_k} (\mathbf{z}_{k,n} - \hat{\mathbf{m}}_k)(\mathbf{z}_{k,n} - \hat{\mathbf{m}}_k)^T \tag{6.5}$$

            Comparison with equation (5.14) shows that $\mathbf{S}_k$ is close to an unbiased
            estimate of the class-dependent covariance matrix: normalizing by $N_k$
            rather than $N_k - 1$ introduces only a small bias when $N_k$ is large. In
            fact, for Gaussian-distributed samples $\mathbf{S}_k$ is the maximum likelihood
            estimate of $\mathbf{C}_k$. Consequently, $\mathbf{S}_k$ supplies information not only
            about the average distance of the scattering, but also about its
            eccentricity and orientation, just as a covariance matrix does.
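
As an illustration of equation (6.5), the following minimal sketch computes the sample mean and the scatter matrix of a single class. It uses plain Python lists; the function name `class_scatter` and the four two-dimensional sample vectors are made up for this example and do not come from the text.

```python
def class_scatter(samples):
    """Return (mean, scatter matrix) of one class, following equation (6.5).

    `samples` is a list of equal-length measurement vectors (hypothetical data).
    """
    n = len(samples)
    dim = len(samples[0])
    # Sample mean of the class.
    mean = [sum(z[i] for z in samples) / n for i in range(dim)]
    # Average of the outer products of the deviations from the mean.
    scatter = [[0.0] * dim for _ in range(dim)]
    for z in samples:
        d = [z[i] - mean[i] for i in range(dim)]
        for i in range(dim):
            for j in range(dim):
                scatter[i][j] += d[i] * d[j] / n
    return mean, scatter


# Illustrative samples only; not taken from the book.
samples_k = [[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [4.0, 2.0]]
mean_k, S_k = class_scatter(samples_k)
print(mean_k)  # [2.0, 1.0]
print(S_k)     # [[2.0, 1.0], [1.0, 1.0]]
```

The nonzero off-diagonal element reflects the orientation of the scattering: in this toy set the two vector elements tend to increase together.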
              Averaged over all classes the scatter matrix describing the noise is:

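$$\mathbf{S}_w = \frac{1}{N_S} \sum_{k=1}^{K} N_k \mathbf{S}_k = \frac{1}{N_S} \sum_{k=1}^{K} \sum_{n=1}^{N_k} (\mathbf{z}_{k,n} - \hat{\mathbf{m}}_k)(\mathbf{z}_{k,n} - \hat{\mathbf{m}}_k)^T \tag{6.6}$$
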
            This matrix is the within-scatter matrix as it describes the average
            scattering within classes. Complementary to this is the between-scatter
            matrix $\mathbf{S}_b$ that describes the scattering of the class-dependent sample
            means around the overall average:

$$\mathbf{S}_b = \frac{1}{N_S} \sum_{k=1}^{K} N_k (\hat{\mathbf{m}}_k - \hat{\mathbf{m}})(\hat{\mathbf{m}}_k - \hat{\mathbf{m}})^T \tag{6.7}$$

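
Equations (6.6) and (6.7) can be sketched in a few lines of plain Python. The sketch below is illustrative only: the helper names and the two-class toy data are hypothetical, and the overall mean is assumed to be the mean of all samples pooled together, which equals the average of the class means weighted by the class sizes.

```python
def mean_vector(samples):
    """Sample mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(z[i] for z in samples) / n for i in range(len(samples[0]))]


def add_outer(acc, d, weight):
    """Add weight * d d^T to the matrix acc, in place."""
    for i, di in enumerate(d):
        for j, dj in enumerate(d):
            acc[i][j] += weight * di * dj


def within_between_scatter(classes):
    """Return (S_w, S_b) for a list of per-class sample lists."""
    dim = len(classes[0][0])
    n_total = sum(len(c) for c in classes)  # total number of samples
    # Assumption: overall mean = mean of all samples pooled together.
    overall = mean_vector([z for c in classes for z in c])
    s_w = [[0.0] * dim for _ in range(dim)]
    s_b = [[0.0] * dim for _ in range(dim)]
    for samples in classes:
        m_k = mean_vector(samples)
        # Within-scatter: deviations of the samples from their class mean.
        for z in samples:
            add_outer(s_w, [a - b for a, b in zip(z, m_k)], 1.0 / n_total)
        # Between-scatter: deviation of the class mean from the overall
        # mean, weighted by the class size.
        d = [a - b for a, b in zip(m_k, overall)]
        add_outer(s_b, d, len(samples) / n_total)
    return s_w, s_b


# Hypothetical training set with two classes of unequal size.
classes = [
    [[0.0, 0.0], [2.0, 0.0]],
    [[4.0, 2.0], [6.0, 2.0], [4.0, 4.0], [6.0, 4.0]],
]
S_w, S_b = within_between_scatter(classes)
```

For this toy set the within-scatter matrix is diagonal, because neither class shows correlation between its two elements, while the between-scatter matrix has a positive off-diagonal element, because the second class mean lies up and to the right of the first.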
            Figure 6.2 illustrates the concepts of within-scatter matrices and
            between-scatter matrices. The figure shows a scatter diagram of a training
            set consisting of four classes. A scatter matrix $\mathbf{S}$ corresponds to an
            ellipse, $\mathbf{z}^T \mathbf{S}^{-1} \mathbf{z} = 1$, that can be thought of as a contour roughly
            surrounding the associated population of samples. Strictly speaking, the
            correspondence holds only if the underlying probability density is
            Gaussian-like. But even if the densities are not Gaussian, the ellipses
            give an impression of how the population is scattered. In the scatter
            diagram in Figure 6.2 the within-scatter $\mathbf{S}_w$ is represented by four
            similar ellipses positioned at the four conditional sample means. The
            between-scatter $\mathbf{S}_b$ is depicted by an ellipse centred at the mixture
            sample mean.
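
To make the link between a scatter matrix and its ellipse concrete, the following sketch generates points on the contour for a two-dimensional scatter matrix. It is a minimal illustration under stated assumptions: the matrix is symmetric and positive definite, the ellipse is shifted to a centre point (as the ellipses in the figure are positioned at sample means), and the function names and numbers are hypothetical. Points are obtained by mapping the unit circle through the Cholesky factor of the scatter matrix, which is one of several possible constructions.

```python
import math


def cholesky_2x2(s):
    """Lower-triangular L with L L^T = s, for a symmetric positive definite 2x2 s."""
    l11 = math.sqrt(s[0][0])
    l21 = s[1][0] / l11
    l22 = math.sqrt(s[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]


def ellipse_points(center, s, count=64):
    """Points z with (z - center)^T s^{-1} (z - center) = 1."""
    low = cholesky_2x2(s)
    points = []
    for i in range(count):
        t = 2.0 * math.pi * i / count
        u0, u1 = math.cos(t), math.sin(t)  # point on the unit circle
        x = center[0] + low[0][0] * u0
        y = center[1] + low[1][0] * u0 + low[1][1] * u1
        points.append((x, y))
    return points


def quadratic_form(point, center, s):
    """Evaluate (z - center)^T s^{-1} (z - center) for a symmetric 2x2 s."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (s[1][1] * dx * dx - 2.0 * s[0][1] * dx * dy + s[0][0] * dy * dy) / det


# Hypothetical scatter matrix and centre, for illustration only.
S = [[4.0, 1.5], [1.5, 1.0]]
m = [2.0, 1.0]
contour = ellipse_points(m, S)
# Every generated point satisfies the ellipse equation up to rounding.
assert all(abs(quadratic_form(p, m, S) - 1.0) < 1e-9 for p in contour)
```

The resulting points can be passed to any plotting routine to overlay the contour on a scatter diagram.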