Thus, the difference $\ln g(y \mid \theta) - Q(\theta \mid \theta_n)$ attains its minimum when $\theta = \theta_n$. If we choose $\theta_{n+1}$ to maximize $Q(\theta \mid \theta_n)$, then it follows that
\begin{align*}
\ln g(y \mid \theta_{n+1}) &= Q(\theta_{n+1} \mid \theta_n) + [\ln g(y \mid \theta_{n+1}) - Q(\theta_{n+1} \mid \theta_n)] \\
&\geq Q(\theta_n \mid \theta_n) + [\ln g(y \mid \theta_n) - Q(\theta_n \mid \theta_n)] \\
&= \ln g(y \mid \theta_n),
\end{align*}
with strict inequality when $f(x \mid \theta_{n+1})/g(y \mid \theta_{n+1})$ and $f(x \mid \theta_n)/g(y \mid \theta_n)$ are different conditional densities or when $Q(\theta_{n+1} \mid \theta_n) > Q(\theta_n \mid \theta_n)$. This verifies the promised ascent property $\ln g(y \mid \theta_{n+1}) \geq \ln g(y \mid \theta_n)$ of the EM algorithm.
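To see the ascent property in action, the following Python fragment runs EM on a hypothetical two-component Poisson mixture (the mixture model, simulated data, and starting values are illustrative only and are not part of the ABO development) and prints the observed loglikelihood $\ln g(y \mid \theta_n)$ after each iteration; the printed values never decrease.

```python
import numpy as np
from scipy.stats import poisson

# Hypothetical data: counts drawn from a two-component Poisson mixture.
rng = np.random.default_rng(0)
z = rng.random(500) < 0.3
y = np.where(z, rng.poisson(2.0, 500), rng.poisson(7.0, 500))

def loglik(pi, lam1, lam2):
    # Observed-data loglikelihood ln g(y | theta) under the mixture model.
    return np.log(pi * poisson.pmf(y, lam1)
                  + (1 - pi) * poisson.pmf(y, lam2)).sum()

pi, lam1, lam2 = 0.5, 1.0, 10.0               # crude starting values
for it in range(15):
    # E step: posterior probability that each count comes from component 1.
    w1 = pi * poisson.pmf(y, lam1)
    w2 = (1 - pi) * poisson.pmf(y, lam2)
    w = w1 / (w1 + w2)
    # M step: maximizing the surrogate Q gives weighted-average updates.
    pi = w.mean()
    lam1 = (w * y).sum() / w.sum()
    lam2 = ((1 - w) * y).sum() / (1 - w).sum()
    print(it, loglik(pi, lam1, lam2))         # monotonically nondecreasing
```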
2.5 Allele Frequency Estimation by the EM Algorithm
Let us return to the ABO example and formalize gene counting as an EM algorithm. The observed numbers of people in each of the four phenotypic categories constitute the observed data $Y$, while the unknown numbers of people in each of the six genotypic categories constitute the complete data $X$. Let $n_{A/A}$ be the number of people of genotype $A/A$. Define $n_{A/O}$, $n_{B/B}$, and $n_{B/O}$ similarly, and set $n = n_A + n_B + n_{AB} + n_O$. Note that the $n_{AB}$ people of phenotype AB and the $n_O$ people of phenotype O are already correctly assigned to their respective genotypes $A/B$ and $O/O$. With this notation the complete data loglikelihood becomes
\begin{align}
\ln f(X \mid p) ={}& n_{A/A} \ln p_A^2 + n_{A/O} \ln(2 p_A p_O) + n_{B/B} \ln p_B^2 \nonumber \\
& + n_{B/O} \ln(2 p_B p_O) + n_{AB} \ln(2 p_A p_B) + n_O \ln p_O^2 \tag{2.3} \\
& + \ln \binom{n}{n_{A/A}\; n_{A/O}\; n_{B/B}\; n_{B/O}\; n_{AB}\; n_O}. \nonumber
\end{align}
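For concreteness, here is a minimal Python sketch that evaluates (2.3) from a set of genotype counts. The function name, the dictionary layout, and the example numbers are assumptions made purely for illustration; the genotype probabilities are the Hardy-Weinberg proportions $p_A^2$, $2p_Ap_O$, and so on appearing in (2.3).

```python
from math import lgamma, log

def complete_data_loglik(counts, p):
    """Evaluate the complete-data loglikelihood (2.3).

    counts: genotype counts keyed by 'A/A', 'A/O', 'B/B', 'B/O', 'AB', 'O'
    p: allele frequencies keyed by 'A', 'B', 'O' (summing to one)
    """
    n = sum(counts.values())
    # log of the multinomial coefficient n! / (n_{A/A}! ... n_O!)
    log_coef = lgamma(n + 1) - sum(lgamma(c + 1) for c in counts.values())
    probs = {'A/A': p['A'] ** 2,         'A/O': 2 * p['A'] * p['O'],
             'B/B': p['B'] ** 2,         'B/O': 2 * p['B'] * p['O'],
             'AB':  2 * p['A'] * p['B'], 'O':   p['O'] ** 2}
    return log_coef + sum(c * log(probs[g]) for g, c in counts.items())

# Hypothetical genotype counts and allele frequencies, purely for illustration.
counts = {'A/A': 15, 'A/O': 60, 'B/B': 3, 'B/O': 25, 'AB': 13, 'O': 84}
print(complete_data_loglik(counts, {'A': 0.25, 'B': 0.08, 'O': 0.67}))
```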
In the E step of the EM algorithm, we take the expectation of $\ln f(X \mid p)$ conditional on the observed counts $n_A$, $n_B$, $n_{AB}$, and $n_O$ and the current parameter vector $p_m = (p_{mA}, p_{mB}, p_{mO})^t$. It is obvious that
\begin{align*}
E(n_{AB} \mid Y, p_m) &= n_{AB} \\
E(n_O \mid Y, p_m) &= n_O.
\end{align*}
A moment’s reflection also yields
\begin{align*}
n_{m,A/A} &= E(n_{A/A} \mid Y, p_m) = n_A \frac{p_{mA}^2}{p_{mA}^2 + 2 p_{mA} p_{mO}} \\
n_{m,A/O} &= E(n_{A/O} \mid Y, p_m) = n_A \frac{2 p_{mA} p_{mO}}{p_{mA}^2 + 2 p_{mA} p_{mO}}.
\end{align*}
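A minimal Python sketch of these two conditional expectations follows; the function name and the numbers in the sample call are hypothetical, and the B-phenotype counts are handled by the symmetric formulas.

```python
def expected_A_genotypes(n_A, p_mA, p_mO):
    """E step for phenotype-A individuals: expected numbers of A/A and
    A/O genotypes given the current allele frequencies p_mA and p_mO."""
    denom = p_mA ** 2 + 2 * p_mA * p_mO          # Hardy-Weinberg P(phenotype A)
    n_AA = n_A * p_mA ** 2 / denom               # n_{m,A/A}
    n_AO = n_A * 2 * p_mA * p_mO / denom         # n_{m,A/O}
    return n_AA, n_AO

# Illustrative call with hypothetical n_A and current frequencies p_m.
print(expected_A_genotypes(182, 0.3, 0.6))       # roughly (36.4, 145.6)
```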