Page 156 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB


            5.2.3  Gaussian distribution, mean and covariance matrix
                   both unknown

            If both the expectation vector and the covariance matrix are unknown,
            the estimation problem becomes more complicated because we then
            have to estimate the expectation vector and the covariance matrix
            simultaneously. It can be deduced that the following estimators for
            $C_k$ and $\mu_k$ are unbiased:


            $$\hat{\mu}_k = \frac{1}{N_k} \sum_{n=1}^{N_k} z_n$$

            $$\hat{C}_k = \frac{1}{N_k - 1} \sum_{n=1}^{N_k} (z_n - \hat{\mu}_k)(z_n - \hat{\mu}_k)^T \qquad (5.14)$$

            $\hat{C}_k$ is called the sample covariance. Comparing (5.12) with (5.14) we
            note two differences. In the latter expression the unknown expectation
            has been replaced with the sample mean. Furthermore, the divisor $N_k$
            has been replaced with $N_k - 1$. Apparently, the lack of knowledge of
            $\mu_k$ in (5.14) makes it necessary to sacrifice one degree of freedom in
            the averaging operator. For large $N_k$, the difference between (5.12) and
            (5.14) vanishes.
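
The estimators in (5.14) can be sketched in a few lines of code. The following is a minimal illustration in plain Python, not the book's own implementation; the function names `sample_mean` and `sample_covariance` and the three example samples are assumptions made purely for this sketch.

```python
def sample_mean(samples):
    """Sample mean of a list of equal-length measurement vectors."""
    n_k = len(samples)
    dim = len(samples[0])
    return [sum(z[i] for z in samples) / n_k for i in range(dim)]


def sample_covariance(samples):
    """Unbiased sample covariance as in (5.14): divisor N_k - 1, not N_k."""
    n_k = len(samples)
    if n_k < 2:
        raise ValueError("at least two samples are needed")
    dim = len(samples[0])
    mean = sample_mean(samples)
    cov = [[0.0] * dim for _ in range(dim)]
    for z in samples:
        # Deviation of this sample from the sample mean.
        d = [z[i] - mean[i] for i in range(dim)]
        # Accumulate the outer product d * d^T.
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += d[i] * d[j]
    return [[c / (n_k - 1) for c in row] for row in cov]


# Hypothetical training set for one class: N_k = 3 samples, dimension 2.
samples = [[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]]
mean_hat = sample_mean(samples)       # [3.0, 4.0]
cov_hat = sample_covariance(samples)  # [[4.0, 2.0], [2.0, 4.0]]
```

With the divisor $N_k$ instead of $N_k - 1$, every entry of the result would be smaller by the factor $(N_k - 1)/N_k$, here 2/3. That factor approaches one as $N_k$ grows.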
              In classification problems, often the inverse $C_k^{-1}$ is needed, for
            instance, in quantities like $z^T C_k^{-1} \mu$, $z^T C_k^{-1} z$, etc. Often,
            $\hat{C}_k^{-1}$ is used as an estimate of $C_k^{-1}$. To determine the number of
            samples required such that $\hat{C}_k^{-1}$ becomes an accurate estimate of
            $C_k^{-1}$, the variance of $\hat{C}_k$, given in (5.13), is not very helpful.
            To see this it is instructive to rewrite the inverse as (see Appendix B.5
            and C.3.2):

            $$C_k^{-1} = V_k \Lambda_k^{-1} V_k^T \qquad (5.15)$$

            where $\Lambda_k$ is a diagonal matrix containing the eigenvalues of $C_k$, and
            the columns of $V_k$ are the corresponding eigenvectors. Clearly, the
            behaviour of $C_k^{-1}$ is strongly affected by small eigenvalues in
            $\Lambda_k$, because each eigenvalue enters the inverse as its reciprocal.
            In fact, the number of nonzero eigenvalues in the estimate $\hat{C}_k$ given
            in (5.14) cannot exceed $N_k - 1$. If $N_k - 1$ is smaller than the dimension
            $N$ of the measurement vector, the estimate $\hat{C}_k$ is not invertible.
            Therefore, we must require that $N_k$ is (much) larger than $N$. As a rule of
            thumb, the number of samples must be chosen such that at least
            $N_k > 5N$.
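
The rank limit on the sample covariance can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names `sample_covariance` and `matrix_rank`, the tolerance `tol`, and the example samples are all hypothetical. The rank is found by Gaussian elimination rather than by an eigenvalue decomposition, which gives the same count of nonzero eigenvalues for a symmetric matrix.

```python
def sample_covariance(samples):
    """Unbiased sample covariance (divisor N_k - 1) of a list of vectors."""
    n_k, dim = len(samples), len(samples[0])
    mean = [sum(z[i] for z in samples) / n_k for i in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for z in samples:
        d = [z[i] - mean[i] for i in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += d[i] * d[j] / (n_k - 1)
    return cov


def matrix_rank(matrix, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n_rows, n_cols = len(a), len(a[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) <= tol:
            continue  # no usable pivot: this column adds nothing to the rank
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, n_rows):
            factor = a[r][col] / a[rank][col]
            for c in range(col, n_cols):
                a[r][c] -= factor * a[rank][c]
        rank += 1
    return rank


# Measurement dimension N = 3 with only N_k = 3 samples: rank is at most N_k - 1 = 2.
few = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [2.0, 2.0, 3.0]]
rank_few = matrix_rank(sample_covariance(few))

# One extra sample (N_k = 4) makes full rank 3 possible, so the estimate is invertible.
enough = few + [[0.0, 0.0, 0.0]]
rank_enough = matrix_rank(sample_covariance(enough))
```

Here `rank_few` is 2, so the 3 x 3 estimate is singular, and `rank_enough` is 3. Full rank only guarantees that the inverse exists. With $N_k$ barely above $N$ the smallest eigenvalues are still poorly estimated, which is why the rule of thumb asks for $N_k > 5N$.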