Page 52 - Applied Probability
2. Counting Methods and the EM Algorithm
is the frequency of a homozygous genotype $A_i/A_i$ and $(1 - f)2p_ip_j$
is the frequency of a heterozygous genotype $A_i/A_j$. Suppose that we
observe $n_{ij}$ people of genotype $A_i/A_j$ in a random sample. Formulate
an EM algorithm for the estimation of the parameters $f, p_1, \ldots, p_k$
from the observed data.
8. Consider the data from the London Times [15] for the years 1910 to
1912 reproduced in Table 2.6. The two columns labeled “Deaths $i$”
refer to the number of deaths of women 80 years and older reported
by day. The columns labeled “Frequency $n_i$” refer to the number of
days with i deaths. A Poisson distribution gives a poor fit to these
data, possibly because of different patterns of deaths in winter and
summer. A mixture of two Poissons provides a much better fit. Under
the Poisson admixture model, the likelihood of the observed data is
$$\prod_{i=0}^{9} \left[ \alpha e^{-\mu_1} \frac{\mu_1^i}{i!} + (1 - \alpha) e^{-\mu_2} \frac{\mu_2^i}{i!} \right]^{n_i},$$
where $\alpha$ is the admixture parameter and $\mu_1$ and $\mu_2$ are the means of
the two Poisson distributions.
TABLE 2.6. Death Notices from the London Times

Deaths i   Frequency n_i   Deaths i   Frequency n_i
    0           162            5            61
    1           267            6            27
    2           271            7             8
    3           185            8             3
    4           111            9             1
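
The statement that a single Poisson distribution fits these data poorly can be checked numerically. The following is a minimal sketch, not part of the original exercise: it estimates one Poisson mean by the sample mean and compares expected with observed day counts. The names `counts`, `poisson_pmf`, and `expected` are illustrative choices, and the frequencies are copied from Table 2.6.

```python
import math

# Table 2.6: counts[i] is the number of days with i deaths (i = 0, ..., 9).
counts = [162, 267, 271, 185, 111, 61, 27, 8, 3, 1]

total_days = sum(counts)
total_deaths = sum(i * n for i, n in enumerate(counts))

# Maximum likelihood estimate of a single Poisson mean: the sample mean.
mean = total_deaths / total_days


def poisson_pmf(i, mu):
    """Poisson probability of observing i events when the mean is mu."""
    return math.exp(-mu) * mu**i / math.factorial(i)


# Expected number of days with i deaths under the single-Poisson model.
expected = [total_days * poisson_pmf(i, mean) for i in range(len(counts))]

for i, (obs, exp_count) in enumerate(zip(counts, expected)):
    print(f"{i:2d}  observed {obs:4d}  expected {exp_count:7.1f}")
```

Under this fit the single Poisson predicts noticeably fewer days with no deaths than were observed, which is the kind of discrepancy a two-component mixture can absorb.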
Formulate an EM algorithm for this model. Let $\theta = (\alpha, \mu_1, \mu_2)$ and
$$z_i(\theta) = \frac{\alpha e^{-\mu_1} \mu_1^i}{\alpha e^{-\mu_1} \mu_1^i + (1 - \alpha) e^{-\mu_2} \mu_2^i}$$
be the posterior probability that a day with $i$ deaths belongs to Poisson
population 1. Show that the EM algorithm is given by
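$$\alpha_{m+1} = \frac{\sum_i n_i z_i(\theta_m)}{\sum_i n_i}$$
$$\mu_{m+1,1} = \frac{\sum_i n_i z_i(\theta_m)\, i}{\sum_i n_i z_i(\theta_m)}$$
$$\mu_{m+1,2} = \frac{\sum_i n_i [1 - z_i(\theta_m)]\, i}{\sum_i n_i [1 - z_i(\theta_m)]}.$$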
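
The updates above can be iterated directly on the data of Table 2.6. The sketch below is one possible implementation under stated assumptions: the function names (`posterior`, `em_step`, `log_likelihood`, `run_em`), the starting point $(\alpha, \mu_1, \mu_2) = (0.3, 1.0, 2.5)$, and the fixed iteration count are choices made here for illustration and are not specified in the exercise.

```python
import math

# Table 2.6: counts[i] is the number of days with i deaths (i = 0, ..., 9).
counts = [162, 267, 271, 185, 111, 61, 27, 8, 3, 1]


def posterior(i, alpha, mu1, mu2):
    """z_i(theta): posterior probability that a day with i deaths is from population 1."""
    a = alpha * math.exp(-mu1) * mu1**i
    b = (1 - alpha) * math.exp(-mu2) * mu2**i
    return a / (a + b)


def em_step(alpha, mu1, mu2):
    """One EM update theta_m -> theta_{m+1} using the three formulas above."""
    z = [posterior(i, alpha, mu1, mu2) for i in range(len(counts))]
    w1 = sum(n * zi for n, zi in zip(counts, z))        # sum_i n_i z_i
    w2 = sum(n * (1 - zi) for n, zi in zip(counts, z))  # sum_i n_i (1 - z_i)
    d1 = sum(n * zi * i for i, (n, zi) in enumerate(zip(counts, z)))
    d2 = sum(n * (1 - zi) * i for i, (n, zi) in enumerate(zip(counts, z)))
    return w1 / sum(counts), d1 / w1, d2 / w2


def log_likelihood(alpha, mu1, mu2):
    """Log of the mixture likelihood of the observed frequencies."""
    total = 0.0
    for i, n in enumerate(counts):
        p = (alpha * math.exp(-mu1) * mu1**i
             + (1 - alpha) * math.exp(-mu2) * mu2**i) / math.factorial(i)
        total += n * math.log(p)
    return total


def run_em(theta, iterations=200):
    """Iterate the EM map, recording the log-likelihood after every step."""
    history = [log_likelihood(*theta)]
    for _ in range(iterations):
        theta = em_step(*theta)
        history.append(log_likelihood(*theta))
    return theta, history


# Assumed starting values; other starting points may be tried as well.
theta, history = run_em((0.3, 1.0, 2.5))
print("alpha = %.4f, mu1 = %.4f, mu2 = %.4f" % theta)
```

Two properties are useful for checking such an implementation. The recorded log-likelihoods should never decrease, and after every update the mixture mean $\alpha \mu_1 + (1 - \alpha)\mu_2$ equals the sample mean of the data, as follows from adding the numerators of the two mean updates.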