Page 198 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB

CRITERIA FOR SELECTION AND EXTRACTION

              The development starts with the following definition of the average
            squared distance of pairs of samples in the training set:


            $$
            \bar{\rho}^2 = \frac{1}{N_S^2} \sum_{n=1}^{N_S} \sum_{m=1}^{N_S} (\mathbf{z}_n - \mathbf{z}_m)^T (\mathbf{z}_n - \mathbf{z}_m) \tag{6.1}
            $$
            The summand $(\mathbf{z}_n - \mathbf{z}_m)^T (\mathbf{z}_n - \mathbf{z}_m)$ is the squared Euclidean distance
            between a pair of samples. The sum involves all pairs of samples in the
            training set. The divisor $N_S^2$ accounts for the number of terms.
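
            As a minimal sketch of the definition in (6.1), the following plain-Python
            function sums the squared Euclidean distance over all ordered pairs of
            samples and divides by the number of terms. The function name
            `average_squared_distance` and the toy data are illustrative assumptions,
            not taken from the text.

```python
def average_squared_distance(samples):
    """Average squared Euclidean distance over all ordered pairs, as in (6.1)."""
    n_s = len(samples)
    total = 0.0
    for z_n in samples:
        for z_m in samples:
            # Squared Euclidean distance (z_n - z_m)^T (z_n - z_m)
            total += sum((a - b) ** 2 for a, b in zip(z_n, z_m))
    # Divide by N_S^2, the number of terms (pairs with n == m contribute zero)
    return total / n_s ** 2


# Hypothetical toy training set: four 2-D measurement vectors
samples = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
rho2 = average_squared_distance(samples)
```

            Note that the function never looks at class labels, which is exactly the
            shortcoming of this measure discussed next.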
              The distance $\bar{\rho}^2$ is useless as a performance measure because none of
            the desired properties mentioned above are met. Moreover, $\bar{\rho}^2$ is defined
            without any reference to the labels of the samples. Thus, it does not give
            any clue about how well the classes in the training set can be discriminated.
            To correct this, $\bar{\rho}^2$ must be divided into a part describing the
            average distance between expectation vectors and a part describing
            distances due to noise scattering. For that purpose, estimates of the
            conditional expectations ($\boldsymbol{\mu}_k = E[\mathbf{z}|\omega_k]$) of the measurement vectors are
            used, along with an estimate of the unconditional expectation
            ($\boldsymbol{\mu} = E[\mathbf{z}]$). The sample mean of class $\omega_k$ is:

            $$
            \hat{\boldsymbol{\mu}}_k = \frac{1}{N_k} \sum_{n=1}^{N_k} \mathbf{z}_{k,n} \tag{6.2}
            $$
            The sample mean of the entire training set is:

            $$
            \hat{\boldsymbol{\mu}} = \frac{1}{N_S} \sum_{n=1}^{N_S} \mathbf{z}_n \tag{6.3}
            $$
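
            The two sample means in (6.2) and (6.3) can be sketched in plain Python as
            follows. The helper `sample_mean`, the dictionary layout and the class
            labels `omega_1` and `omega_2` are hypothetical choices made for this
            illustration only.

```python
def sample_mean(vectors):
    """Component-wise sample mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


# Hypothetical labeled training set: class label -> measurement vectors z_{k,n}
training_set = {
    "omega_1": [(0.0, 0.0), (2.0, 0.0)],
    "omega_2": [(4.0, 4.0), (6.0, 4.0), (5.0, 7.0)],
}

# Class sample means, as in (6.2)
class_means = {label: sample_mean(vectors) for label, vectors in training_set.items()}

# Sample mean of the entire training set, as in (6.3)
all_samples = [z for vectors in training_set.values() for z in vectors]
overall_mean = sample_mean(all_samples)
```

            The overall mean equals the average of the class means weighted by the
            class sizes $N_k$, which is why larger classes pull it toward themselves.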
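            With these definitions, it can be shown (see exercise 1) that the average
            squared distance is:

            $$
            \bar{\rho}^2 = \frac{2}{N_S} \sum_{k=1}^{K} \sum_{n=1}^{N_k} \left[ (\mathbf{z}_{k,n} - \hat{\boldsymbol{\mu}}_k)^T (\mathbf{z}_{k,n} - \hat{\boldsymbol{\mu}}_k) + (\hat{\boldsymbol{\mu}} - \hat{\boldsymbol{\mu}}_k)^T (\hat{\boldsymbol{\mu}} - \hat{\boldsymbol{\mu}}_k) \right] \tag{6.4}
            $$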
            The first term represents the average squared distance due to the scattering
            of samples around their class-dependent expectation. The second
            term corresponds to the average squared distance between class-dependent
            expectations and the unconditional expectation.
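
            As a numerical sanity check rather than a proof, the sketch below evaluates
            the pairwise definition (6.1) and the two terms of (6.4) on a small labeled
            data set and compares them. The helper names, the variables `within` and
            `between`, and the toy data are assumptions introduced for this example.

```python
def squared_distance(u, v):
    """Squared Euclidean distance (u - v)^T (u - v)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def sample_mean(vectors):
    """Component-wise sample mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


# Hypothetical labeled training set: one list of measurement vectors per class
classes = [
    [(0.0, 0.0), (2.0, 0.0)],
    [(4.0, 4.0), (6.0, 4.0), (5.0, 7.0)],
]
all_samples = [z for vectors in classes for z in vectors]
n_s = len(all_samples)
overall_mean = sample_mean(all_samples)

# Left-hand side: average squared distance over all ordered pairs, as in (6.1)
rho2 = sum(
    squared_distance(z_n, z_m) for z_n in all_samples for z_m in all_samples
) / n_s ** 2

# Right-hand side: the two terms of (6.4)
within = 0.0   # scattering of samples around their class mean
between = 0.0  # distance between class means and the overall mean
for vectors in classes:
    mean_k = sample_mean(vectors)
    within += sum(squared_distance(z, mean_k) for z in vectors)
    # The inner sum over n repeats the same term N_k times
    between += len(vectors) * squared_distance(overall_mean, mean_k)
within *= 2.0 / n_s
between *= 2.0 / n_s
```

            For this toy set `within` is about 4.0 and `between` is about 19.68, and
            their sum matches `rho2` up to floating-point rounding. The large second
            term reflects that the two class means lie far apart relative to the
            scatter inside each class.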