Page 82 - Algorithm Collections for Digital Signal Processing Applications using MATLAB
The requirement is to estimate the best value for m1. But computing the best value requires p(c1/x1), which in turn requires p(x1/c1) as discussed above, and p(x1/c1) itself requires m1 for its computation. [Surprise!].
To break this circular dependency, an iterative approach called the Expectation-Maximization algorithm is used to estimate the best value for m1, as described below.
2.1 Expectation-Maximization Algorithm
1. Initialize the mean vectors m1, m2, m3, … mn randomly.
2. Compute the Euclidean distances between the vector x1 and the mean vectors m1, m2, … mn as 'd1', 'd2', … 'dn' respectively. If the distance d2 is the smallest among all, the vector x1 is treated as belonging to the 2nd cluster. In general, if dk is the smallest among all, the vector x1 belongs to the kth cluster; thus the cluster index assigned to the vector x1 is 'k'. Similarly, a cluster index is obtained for every data vector.
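The nearest-mean assignment rule in step 2 can be sketched as follows (a minimal plain-Python illustration; the function name `assign_clusters` and the list-of-lists vector representation are our own, not from the book, whose examples are written in MATLAB):

```python
import math

def assign_clusters(data, means):
    """For each data vector x, return the index k of the nearest
    mean vector, i.e. the k that minimizes d_k = ||x - m_k|| (step 2)."""
    def dist(a, b):
        # Euclidean distance between two equal-length vectors
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    return [min(range(len(means)), key=lambda k: dist(x, means[k]))
            for x in data]

# Example: the first vector is closest to mean 0, the second to mean 1
indices = assign_clusters([[0, 0], [5, 5]], [[0, 1], [6, 5]])
print(indices)  # -> [0, 1]
```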
3. Let the number of data vectors belonging to the first cluster be 'k1', i.e., the number of data vectors having cluster index 1 is 'k1'. Then p(c1), the probability of cluster 1, is computed as

p(c1) = k1 / (Total number of data vectors)

and p(c2) is computed as

p(c2) = k2 / (Total number of data vectors)

and so on.
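The cluster probabilities of step 3 follow directly from the cluster indices of step 2. A minimal sketch (the helper name `cluster_priors` is our own; it assumes the index list produced in the previous step):

```python
def cluster_priors(indices, n_clusters):
    """Estimate p(c_j) = k_j / (total number of data vectors),
    where k_j counts the vectors assigned to cluster j (step 3)."""
    total = len(indices)
    counts = [indices.count(j) for j in range(n_clusters)]
    return [k / total for k in counts]

# Example: 3 of 4 vectors in cluster 0, 1 of 4 in cluster 1
print(cluster_priors([0, 1, 0, 0], 2))  # -> [0.75, 0.25]
```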
4. The covariance matrix of cluster 1 is computed as cov1 = E([X - μX][X - μX]^T) = E(XX^T) - μX μX^T. For instance, let the vectors x1, x3, x7 and x9, arranged in column format, belong to cluster 1; the corresponding covariance matrix is computed as follows.

μX = [x1 + x3 + x7 + x9]/4

E(XX^T) = [x1 x1^T + x3 x3^T + x7 x7^T + x9 x9^T]/4

Thus cov1 is computed as E(XX^T) - μX μX^T.
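The identity cov = E(XX^T) - μX μX^T used in step 4 can be sketched in plain Python, computing the average outer product and subtracting the outer product of the mean (the function name `cluster_covariance` is our own; it takes the member vectors of one cluster as rows):

```python
def cluster_covariance(vectors):
    """Covariance of one cluster via cov = E(X X^T) - mu mu^T (step 4)."""
    n, dim = len(vectors), len(vectors[0])
    # mu: mean vector of the cluster members
    mu = [sum(v[i] for v in vectors) / n for i in range(dim)]
    # E(X X^T): average of the outer products x x^T
    exx = [[sum(v[i] * v[j] for v in vectors) / n for j in range(dim)]
           for i in range(dim)]
    # Subtract mu mu^T element-wise
    return [[exx[i][j] - mu[i] * mu[j] for j in range(dim)]
            for i in range(dim)]

# Example: two 2-D vectors varying only in the first component
print(cluster_covariance([[1, 0], [3, 0]]))  # -> [[1.0, 0.0], [0.0, 0.0]]
```

Note that this divides by n (not n-1), matching the expectation-based definition in the text rather than the unbiased sample estimate.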
5. Similarly, the covariance matrices cov2, cov3, cov4, … covn are computed.