Page 248 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB



For the M step, we have to optimize the expectation of the
log-likelihood, that is $E[L(Y|\Psi)|Z] = L(\bar{X}, Z|\Psi)$. We do this by
substituting $\bar{x}_{n,k}$ for $x_{n,k}$ into equation (7.16), taking the
derivative with respect to $\Psi$ and setting the result to zero. Solving the
equations will yield expressions for the parameters
$\Psi = \{\pi_k, \mu_k, C_k\}$ in terms of the data $z_n$ and $\bar{x}_{n,k}$.
Taking the derivative of $L(\bar{X}, Z|\Psi)$ with respect to $\mu_k$ gives:

$$\sum_{n=1}^{N_S} \bar{x}_{n,k} (z_n - \mu_k)^T C_k^{-1} = 0 \tag{7.20}$$
Rewriting this gives the update rule for $\mu_k$:

$$\hat{\mu}_k = \frac{\sum_{n=1}^{N_S} \bar{x}_{n,k} z_n}{\sum_{n=1}^{N_S} \bar{x}_{n,k}} \tag{7.21}$$
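The update rule (7.21) is a membership-weighted average of the samples. A minimal sketch in plain Python follows, assuming the samples are stored as lists of floats and the expected memberships for one cluster come from the E step. The function name `update_mean` and the toy data are hypothetical and serve only as illustration.

```python
def update_mean(z, xbar_k):
    """Weighted mean of eq. (7.21) for a single cluster k.

    z      : list of N_S samples, each a list of D floats (the z_n)
    xbar_k : list of N_S expected memberships (the x-bar_{n,k})
    """
    total = sum(xbar_k)  # denominator: sum over n of x-bar_{n,k}
    if total <= 0.0:
        raise ValueError("cluster has zero total membership")
    dim = len(z[0])
    # numerator: sum over n of x-bar_{n,k} * z_n, one coordinate at a time
    return [sum(w * zn[d] for w, zn in zip(xbar_k, z)) / total
            for d in range(dim)]


# Toy data, made up for illustration: three 2-D samples and their memberships.
z = [[0.0, 0.0], [2.0, 0.0], [4.0, 4.0]]
xbar_k = [0.5, 0.5, 1.0]
mu_k = update_mean(z, xbar_k)
```

With these toy values the third sample carries half of the total membership, so the estimate is pulled toward it: `mu_k` is `[2.5, 2.0]`.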
The estimation of $C_k$ is somewhat more complicated. With the help of
(b.39), we can derive:

$$\frac{\partial}{\partial C_k^{-1}} \ln N(z_n | \hat{\mu}_k, C_k) = \frac{1}{2} C_k - \frac{1}{2} (z_n - \hat{\mu}_k)(z_n - \hat{\mu}_k)^T \tag{7.22}$$
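The intermediate step can be sketched as follows, assuming that the derivatives (7.22) are weighted by the expected memberships $\bar{x}_{n,k}$, summed over all samples and set to zero (the $\ln \pi_k$ term of the log-likelihood does not depend on $C_k$ and drops out):

$$\sum_{n=1}^{N_S} \bar{x}_{n,k} \left[ \frac{1}{2} C_k - \frac{1}{2} (z_n - \hat{\mu}_k)(z_n - \hat{\mu}_k)^T \right] = 0 \quad\Longrightarrow\quad C_k \sum_{n=1}^{N_S} \bar{x}_{n,k} = \sum_{n=1}^{N_S} \bar{x}_{n,k} (z_n - \hat{\mu}_k)(z_n - \hat{\mu}_k)^T$$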
This results in the following update rule for $C_k$:

$$\hat{C}_k = \frac{\sum_{n=1}^{N_S} \bar{x}_{n,k} (z_n - \hat{\mu}_k)(z_n - \hat{\mu}_k)^T}{\sum_{n=1}^{N_S} \bar{x}_{n,k}} \tag{7.23}$$
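The covariance update (7.23) is a membership-weighted average of outer products of the centred samples. A minimal sketch in plain Python follows, assuming the mean estimate for the cluster has already been computed. The function name `update_covariance` and the toy data are hypothetical.

```python
def update_covariance(z, xbar_k, mu_k):
    """Weighted covariance of eq. (7.23) for a single cluster k.

    z      : list of N_S samples, each a list of D floats (the z_n)
    xbar_k : list of N_S expected memberships (the x-bar_{n,k})
    mu_k   : list of D floats, the updated mean of cluster k
    """
    total = sum(xbar_k)  # denominator: sum over n of x-bar_{n,k}
    if total <= 0.0:
        raise ValueError("cluster has zero total membership")
    dim = len(mu_k)
    cov = [[0.0] * dim for _ in range(dim)]
    for w, zn in zip(xbar_k, z):
        diff = [zn[d] - mu_k[d] for d in range(dim)]  # z_n - mu_k
        for i in range(dim):
            for j in range(dim):
                # accumulate x-bar_{n,k} * (z_n - mu_k)(z_n - mu_k)^T
                cov[i][j] += w * diff[i] * diff[j]
    return [[c / total for c in row] for row in cov]


# Toy data, made up for illustration; mu_k is the weighted mean of z.
z = [[0.0, 0.0], [2.0, 0.0], [4.0, 4.0]]
xbar_k = [0.5, 0.5, 1.0]
mu_k = [2.5, 2.0]
C_k = update_covariance(z, xbar_k, mu_k)
```

For these toy values `C_k` is `[[2.75, 3.0], [3.0, 4.0]]`. The estimate is symmetric by construction. In practice a cluster with very small total membership can yield a nearly singular matrix, which this sketch does not guard against.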
Finally, the parameters $\pi_k$ cannot be optimized directly because of the
extra constraint, namely that $\sum_{k=1}^{K} \pi_k = 1$. This constraint can be
enforced by introducing a Lagrange multiplier $\lambda$ and extending the
log-likelihood (7.16) by:

$$L'(Y|\Psi) = \sum_{n=1}^{N_S} \sum_{k=1}^{K} \left[ x_{n,k} \ln N(z_n | \mu_k, C_k) + x_{n,k} \ln \pi_k \right] - \lambda \left( \sum_{k=1}^{K} \pi_k - 1 \right) \tag{7.24}$$
