5.2.3 Gaussian distribution, mean and covariance matrix both unknown
If both the expectation vector and the covariance matrix are unknown, the estimation problem becomes more complicated because we then have to estimate the expectation vector and the covariance matrix simultaneously. It can be deduced that the following estimators for $\mathbf{C}_k$ and $\boldsymbol{\mu}_k$ are unbiased:

\[
\hat{\boldsymbol{\mu}}_k = \frac{1}{N_k}\sum_{n=1}^{N_k}\mathbf{z}_n, \qquad
\hat{\mathbf{C}}_k = \frac{1}{N_k-1}\sum_{n=1}^{N_k}
\left(\mathbf{z}_n-\hat{\boldsymbol{\mu}}_k\right)
\left(\mathbf{z}_n-\hat{\boldsymbol{\mu}}_k\right)^{T}
\tag{5.14}
\]
$\hat{\mathbf{C}}_k$ is called the sample covariance. Comparing (5.12) with (5.14) we note two differences. In the latter expression the unknown expectation has been replaced with the sample mean. Furthermore, the divisor $N_k$ has been replaced with $N_k-1$. Apparently, the lack of knowledge of $\boldsymbol{\mu}_k$ in (5.14) makes it necessary to sacrifice one degree of freedom in the averaging operator. For large $N_k$, the difference between (5.12) and (5.14) vanishes.
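As a concrete illustration, the following MATLAB fragment evaluates both estimators of (5.14) directly from a data matrix. It is a minimal sketch: the dimension, the sample size and the data are hypothetical stand-ins; the built-in functions mean and cov happen to use the same unbiased conventions.

% Sketch of the estimators in (5.14); sizes and data are hypothetical.
N  = 4;                              % dimension of the measurement vector
Nk = 100;                            % number of samples of class k
Z  = randn(N, Nk);                   % columns are the samples z_n

mu_hat = sum(Z, 2) / Nk;             % sample mean, first line of (5.14)
D      = Z - repmat(mu_hat, 1, Nk);  % deviations from the sample mean
C_hat  = D * D' / (Nk - 1);          % sample covariance, divisor Nk - 1

% Cross-check: mean(Z, 2) reproduces mu_hat, and cov(Z') uses the
% same Nk - 1 divisor, so it reproduces C_hat.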
In classification problems, often the inverse $\mathbf{C}_k^{-1}$ is needed, for instance in quantities like $\mathbf{z}^{T}\mathbf{C}^{-1}\boldsymbol{\mu}$, $\mathbf{z}^{T}\mathbf{C}^{-1}\mathbf{z}$, etc. Often, $\hat{\mathbf{C}}_k^{-1}$ is used as an estimate of $\mathbf{C}_k^{-1}$. To determine the number of samples required such that $\hat{\mathbf{C}}_k^{-1}$ becomes an accurate estimate of $\mathbf{C}_k^{-1}$, the variance of $\hat{\mathbf{C}}_k$, given in (5.13), is not very helpful. To see this it is instructive to rewrite the inverse as (see Appendix B.5 and C.3.2):

\[
\mathbf{C}_k^{-1} = \mathbf{V}_k\boldsymbol{\Lambda}_k^{-1}\mathbf{V}_k^{T}
\tag{5.15}
\]
where $\boldsymbol{\Lambda}_k$ is a diagonal matrix containing the eigenvalues of $\mathbf{C}_k$ and the columns of $\mathbf{V}_k$ are the corresponding eigenvectors. Clearly, the behaviour of $\mathbf{C}_k^{-1}$ is strongly affected by small eigenvalues in $\boldsymbol{\Lambda}_k$.
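A minimal MATLAB sketch of (5.15) makes this sensitivity explicit; the sample covariance C_hat used here is a hypothetical stand-in, and eig is the standard eigendecomposition.

% Inverse via the eigendecomposition (5.15); C_hat is a stand-in.
C_hat = cov(randn(100, 4));               % hypothetical 4 x 4 sample covariance
[V, Lambda] = eig(C_hat);                 % columns of V: eigenvectors
C_inv = V * diag(1 ./ diag(Lambda)) * V'; % V * Lambda^{-1} * V'
% A small, poorly estimated eigenvalue lambda enters C_inv as 1/lambda,
% so it dominates the estimated inverse.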
In fact, the number of nonzero eigenvalues of the estimate $\hat{\mathbf{C}}_k$ given in (5.14) cannot exceed $N_k-1$, because the $N_k$ deviation vectors $\mathbf{z}_n-\hat{\boldsymbol{\mu}}_k$ sum to zero and thus span at most an $(N_k-1)$-dimensional subspace. If $N_k-1$ is smaller than the dimension $N$ of the measurement vector, the estimate $\hat{\mathbf{C}}_k$ is not invertible. Therefore, we must require that $N_k$ is (much) larger than $N$. As a rule of thumb, the number of samples must be chosen such that at least $N_k > 5N$. The rank bound is easy to verify numerically, as the sketch below shows.
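The following MATLAB fragment illustrates the bound with hypothetical sizes in which $N_k - 1 = 5 < N = 10$, so the sample covariance is singular.

% Rank deficiency of the sample covariance when Nk - 1 < N.
N  = 10;                   % dimension of the measurement vector
Nk = 6;                    % only Nk - 1 = 5 independent deviations
Z  = randn(N, Nk);         % hypothetical samples
C_hat = cov(Z');           % 10 x 10 sample covariance, divisor Nk - 1
rank(C_hat)                % returns at most Nk - 1 = 5: not invertible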