3. Newton's Method and Scoring
11. … $\theta_i$ for $1 \le i \le m-1$ and $1 - \sum_{j=1}^{m-1} \theta_j$ for $i = m$, then show that $X$ has expected information
$$
J(\theta) \;=\; n\begin{pmatrix}
\theta_1^{-1} & 0 & \cdots & 0 \\
0 & \theta_2^{-1} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \theta_{m-1}^{-1}
\end{pmatrix}
\;+\; \frac{n}{1 - \sum_{j=1}^{m-1} \theta_j}\,\mathbf{1}\mathbf{1}^t,
$$
where $\mathbf{1}$ is a column vector of ones.
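As a quick numerical sanity check on this formula (a sketch with illustrative values, not part of the problem), the Python snippet below compares the closed-form $J(\theta)$ against a Monte Carlo estimate of the covariance of the multinomial score; the trial count and cell probabilities are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 100
theta = np.array([0.2, 0.3, 0.1])   # m = 4 categories; last cell has prob 0.4
theta_m = 1.0 - theta.sum()         # probability of category m

# Closed-form expected information from Problem 11:
#   J(theta) = n diag(1/theta_1, ..., 1/theta_{m-1}) + n/(1 - sum_j theta_j) 11^t
J = n * np.diag(1.0 / theta) + (n / theta_m) * np.ones((3, 3))

# Monte Carlo check via the information identity J(theta) = Cov(score);
# the score of the multinomial log likelihood in theta_i is X_i/theta_i - X_m/theta_m.
X = rng.multinomial(n, np.append(theta, theta_m), size=1_000_000)
score = X[:, :3] / theta - X[:, 3:] / theta_m
C = np.cov(score, rowvar=False)
print(np.abs(C - J).max() / np.abs(J).max())   # small relative error
```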
12. In the setting of the EM algorithm, suppose that $Y$ is the observed data and $X$ is the complete data. Let $Y$ and $X$ have expected information matrices $J(\theta)$ and $I(\theta)$, respectively. Prove that $I(\theta) \succeq J(\theta)$ in the notation of Problem 9. If we could redesign our experiment so that $X$ is observed directly, then invoke Problem 10 and argue that the standard error of the maximum likelihood estimate of any component $\theta_i$ will tend to decrease. (Hints: Using the notation of Section 2.4, let $h(X \mid \theta) = f(X \mid \theta)/g(Y \mid \theta)$ and prove that
$$
I(\theta) - J(\theta) \;=\; \mathrm{E}\{\mathrm{E}[-d^2 \ln h(X \mid \theta) \mid Y, \theta]\}.
$$
The inner expectation on the right of this equation is an expected information.)
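A small numerical illustration of this information inequality, using an assumed toy model rather than anything from the text: in the classic genetic-linkage example often used to introduce EM, the observed data merge two complete-data cells, and the complete-data information dominates the observed-data information at every parameter value. The sketch below checks this for a scalar parameter, using the standard multinomial information formula $n \sum_k p_k'(\theta)^2 / p_k(\theta)$.

```python
import numpy as np

# Complete data X: 5-cell multinomial with probabilities
#   (1/2, t/4, (1-t)/4, (1-t)/4, t/4);
# observed data Y merge the first two cells, giving
#   ((2+t)/4, (1-t)/4, (1-t)/4, t/4).

def info(p, dp, n=1.0):
    # Multinomial expected information for a scalar parameter: n * sum dp^2 / p.
    return n * np.sum(dp**2 / p)

def I_complete(t):
    p  = np.array([0.5, t / 4, (1 - t) / 4, (1 - t) / 4, t / 4])
    dp = np.array([0.0, 0.25, -0.25, -0.25, 0.25])
    return info(p, dp)

def J_observed(t):
    p  = np.array([(2 + t) / 4, (1 - t) / 4, (1 - t) / 4, t / 4])
    dp = np.array([0.25, -0.25, -0.25, 0.25])
    return info(p, dp)

for t in (0.1, 0.5, 0.9):
    print(t, I_complete(t) - J_observed(t) > 0)   # True: I(t) >= J(t)
```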
13. As an application of Problems 10, 11 and 12, consider the estimation of haplotype frequencies from a random sample of people who are genotyped at the same linked, codominant loci. The resulting multilocus genotypes lack phase. Find an explicit upper bound on the expected information matrix for the haplotype frequencies and an explicit lower bound on the standard error of each estimated frequency. (Hints: The complete data specify phase. For the standard-error bound, use the Sherman-Morrison formula.)
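One way to see how the Sherman-Morrison step of the hint plays out numerically (a sketch with illustrative frequencies, not the full solution): applying $(A + uv^t)^{-1} = A^{-1} - A^{-1}uv^tA^{-1}/(1 + v^tA^{-1}u)$ to the complete-data information of Problem 11 suggests a variance bound of $\theta_i(1-\theta_i)/n$ per frequency.

```python
import numpy as np

theta = np.array([0.4, 0.3, 0.2])   # illustrative haplotype frequencies
theta_m = 1.0 - theta.sum()         # frequency of the last haplotype
n = 200                             # number of sampled genes (2 per person)

# Complete-data (phase-known) information from Problem 11, written as a
# rank-one update: I = n diag(1/theta_i) + (n/theta_m) 11^t = A + u u^t.
A_inv = np.diag(theta) / n
u = np.sqrt(n / theta_m) * np.ones(3)

# Sherman-Morrison: (A + u u^t)^{-1} = A^{-1} - A^{-1} u u^t A^{-1} / (1 + u^t A^{-1} u).
I_inv = A_inv - np.outer(A_inv @ u, A_inv @ u) / (1.0 + u @ A_inv @ u)

I = n * np.diag(1.0 / theta) + (n / theta_m) * np.ones((3, 3))
print(np.allclose(I_inv, np.linalg.inv(I)))                   # True
print(np.allclose(np.diag(I_inv), theta * (1 - theta) / n))   # True
print(np.sqrt(theta * (1 - theta) / n))   # standard-error lower bounds
```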
14. Let $Y = (Y_1, \ldots, Y_k)^t$ follow a Dirichlet distribution with parameters $\alpha_1, \ldots, \alpha_k$. Compute $\mathrm{Var}(Y_i)$ and $\mathrm{Cov}(Y_i, Y_j)$ for $i \ne j$. Also show that $(Y_1 + Y_2, Y_3, \ldots, Y_k)^t$ has a Dirichlet distribution.
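The moments requested here can be checked by simulation. The sketch below states the standard Dirichlet covariance formulas (the quantities the problem asks you to derive) and verifies them against a Monte Carlo sample; the parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = np.array([2.0, 3.0, 5.0])   # illustrative parameters
a0 = alpha.sum()

# Standard Dirichlet moments:
#   Var(Y_i)      = alpha_i (a0 - alpha_i) / (a0^2 (a0 + 1))
#   Cov(Y_i, Y_j) = -alpha_i alpha_j / (a0^2 (a0 + 1)),  i != j
C_exact = (a0 * np.diag(alpha) - np.outer(alpha, alpha)) / (a0**2 * (a0 + 1))

Y = rng.dirichlet(alpha, size=500_000)
print(np.abs(np.cov(Y, rowvar=False) - C_exact).max())   # near zero

# Aggregation: (Y_1 + Y_2, Y_3)^t is Dirichlet(alpha_1 + alpha_2, alpha_3),
# so Y_1 + Y_2 is Beta(5, 5) with mean 1/2.
print((Y[:, 0] + Y[:, 1]).mean())                        # approximately 0.5
```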
15. In the notation of Problem 14, find the score and expected information of a single observation from the Dirichlet distribution. (Hint: In calculating the expected information, take the expectation of the observed information rather than the covariance matrix of the score.)
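Following the hint, a short sketch (illustrative parameter values): for a single Dirichlet observation the observed information is free of $y$, so it equals its own expectation. The code implements the score and expected information with SciPy's digamma and trigamma functions and checks the identity $\text{expected information} = \mathrm{Cov}(\text{score})$ by simulation.

```python
import numpy as np
from scipy.special import digamma, polygamma

alpha = np.array([2.0, 3.0, 5.0])   # illustrative parameters

def score(y, alpha):
    # d/d alpha_i  ln f(y | alpha) = digamma(a0) - digamma(alpha_i) + ln y_i
    return digamma(alpha.sum()) - digamma(alpha) + np.log(y)

def expected_info(alpha):
    # Observed information: -d^2 ln f = diag(trigamma(alpha_i)) - trigamma(a0) 11^t;
    # it does not involve y, so it equals the expected information directly.
    return np.diag(polygamma(1, alpha)) - polygamma(1, alpha.sum())

rng = np.random.default_rng(2)
Y = rng.dirichlet(alpha, size=200_000)
S = score(Y, alpha)                 # one score vector per row
print(np.allclose(np.cov(S, rowvar=False), expected_info(alpha), rtol=0.05))
```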
16. Suppose $n$ unrelated people are sampled at a codominant locus with $k$ alleles. If $N_i = n_i$ genes of allele type $i$ are counted, and if a Dirichlet prior is assumed with parameters $\alpha_1, \ldots, \alpha_k$, then we have seen that the allele frequency vector $p = (p_1, \ldots, p_k)^t$ has a posterior Dirichlet distribution with parameters $\alpha_1 + n_1, \ldots, \alpha_k + n_k$.
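A minimal sketch of this conjugate update, with a made-up prior and counts (neither is from the text):

```python
import numpy as np

alpha = np.array([1.0, 1.0, 1.0])   # uniform Dirichlet prior (illustrative)
counts = np.array([38, 52, 10])     # hypothetical gene counts n_i (n = 100 genes)

post = alpha + counts               # posterior parameters alpha_i + n_i
print(post / post.sum())            # posterior mean allele frequencies
```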