Page 69 - Applied Probability

3. Newton’s Method and Scoring
                              do not appear in most population samples. In the absence of data to the
                              contrary, one can argue that it is reasonable to steer haplotype frequency
                              estimates toward linkage equilibrium. From an empirical Bayesian perspec-
                              tive, the most natural equilibrium frequencies can be found by computing
                              allele frequency estimates at each locus and taking products. We build on
                              this insight by choosing a Dirichlet prior whose mode occurs at these es-
                              timated haplotype frequencies. A short calculation shows that the mode
                              of the Dirichlet density (3.12) reduces to the point p with coordinates
                              $p_i = \beta_i/\beta$, where $\beta_i = \alpha_i - 1$ and $\beta = \alpha_\cdot - k$. Thus we choose $\beta_i$ so
                              that the ratio $\beta_i/\beta$ coincides with the frequency of the $i$th haplotype under
                              linkage equilibrium using the estimated allele frequencies. These choices do
                              not determine β, which specifies the overall strength of the prior.
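
                              The construction above can be sketched in a few lines of Python. This is
                              only an illustration under stated assumptions: the function name
                              `equilibrium_prior`, the two biallelic loci, their allele frequencies, and
                              the prior strength `beta = 10` are all hypothetical and do not come from
                              the text.

```python
from itertools import product


def equilibrium_prior(allele_freqs, beta):
    """Dirichlet parameters alpha_i whose mode sits at linkage equilibrium.

    allele_freqs: one dict per locus, mapping allele -> estimated frequency.
    beta: overall strength of the prior (a free choice, see the text).
    """
    alphas = {}
    for haplotype in product(*(locus.keys() for locus in allele_freqs)):
        # Linkage equilibrium frequency: product of the allele frequencies.
        p = 1.0
        for locus, allele in zip(allele_freqs, haplotype):
            p *= locus[allele]
        # beta_i = beta * p and alpha_i = beta_i + 1.
        alphas[haplotype] = 1.0 + beta * p
    return alphas


# Hypothetical allele frequency estimates at two loci.
allele_freqs = [{"A": 0.7, "a": 0.3}, {"B": 0.6, "b": 0.4}]
alphas = equilibrium_prior(allele_freqs, beta=10.0)

# Recover the mode p_i = (alpha_i - 1) / (alpha_. - k) as a check.
k = len(alphas)
alpha_dot = sum(alphas.values())
mode = {h: (a - 1.0) / (alpha_dot - k) for h, a in alphas.items()}
```

                              Here `mode` reproduces the linkage equilibrium frequencies for any positive
                              `beta`; larger values of `beta` only make the prior more concentrated
                              around that point.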
                                Problem 5 of Chapter 2 discusses the standard EM algorithm for maxi-
                              mum likelihood estimation of haplotype frequencies from a random sample
                              of individuals. The Bayesian version of the EM algorithm adds β pseudo-
                              haplotypes to the various haplotype classes in proportion to their linkage
                              equilibrium frequencies $\beta_i/\beta$. Problem 12 of this chapter shows how to include
                              these pseudo-haplotypes in the haplotype counting update of the EM
                              algorithm.
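
                              One plausible reading of the pseudo-haplotype update is sketched below.
                              It assumes the E step has already produced expected haplotype counts, and
                              it uses the standard posterior-mode update for multinomial counts under a
                              Dirichlet prior. The function name `bayesian_em_update` and all numbers
                              are hypothetical; Problem 12 remains the authoritative derivation.

```python
def bayesian_em_update(expected_counts, equilibrium_freqs, beta):
    """One haplotype counting update with beta pseudo-haplotypes added.

    expected_counts: haplotype -> expected count from the E step.
    equilibrium_freqs: haplotype -> linkage equilibrium frequency beta_i / beta.
    beta: total number of pseudo-haplotypes (prior strength).
    """
    total = sum(expected_counts.values()) + beta
    return {
        # Class i receives beta * (beta_i / beta) = beta_i pseudo-haplotypes.
        h: (expected_counts[h] + beta * equilibrium_freqs[h]) / total
        for h in expected_counts
    }


# Hypothetical expected counts for 10 individuals (20 haplotypes).
expected_counts = {"AB": 12.0, "Ab": 3.0, "aB": 4.0, "ab": 1.0}
equilibrium_freqs = {"AB": 0.42, "Ab": 0.28, "aB": 0.18, "ab": 0.12}

updated = bayesian_em_update(expected_counts, equilibrium_freqs, beta=10.0)
ml_only = bayesian_em_update(expected_counts, equilibrium_freqs, beta=0.0)
```

                              With `beta = 0` the update collapses to the ordinary maximum likelihood
                              counting step; with `beta > 0` each estimate is pulled toward its linkage
                              equilibrium value.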


                              3.9 Problems


                                 1. Let f(x) be a real-valued function whose Hessian matrix
                                    $\left(\frac{\partial^2 f}{\partial x_i \partial x_j}\right)$
                                    is positive definite throughout some convex open set U of $R^m$. For
                                    u ≠ 0 and x ∈ U, show that the function t → f(x + tu) of the real
                                   variable t is strictly convex on {t : x + tu ∈ U}. Use this fact to
                                   demonstrate that f(x) can have at most one local minimum point on
                                   any convex subset of U.

                                 2. Apply the result of Problem 1 to show that the loglikelihood of the
                                   observed data in the ABO example of Chapter 2 is strictly concave
                                   and therefore possesses a single global maximum. Why does the max-
                                   imum occur on the interior of the feasible region?
                                 3. Show that Newton’s method converges in one iteration to the maxi-
                                   mum of the quadratic function

                                       $$L(\theta) = d + e^t \theta + \frac{1}{2} \theta^t F \theta$$
                                   if the symmetric matrix F is negative definite.

                                 4. Verify the loglikelihood, score, and expected information entries in
                                   Table 3.1 for the binomial, Poisson, and exponential families.
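
                                    As a numerical sanity check on Problem 3 (not a proof), the
                                    following sketch applies one Newton step to a two-dimensional
                                    quadratic. The vector e, the negative definite matrix F, and the
                                    starting point are hypothetical values chosen for illustration; the
                                    constant d does not affect the iteration and is omitted.

```python
def newton_step(theta, e, F):
    """One Newton step for L(theta) = d + e^t theta + (1/2) theta^t F theta (2 x 2 case)."""
    # Gradient of L at theta is e + F theta; the Hessian is the constant matrix F.
    g = [
        e[0] + F[0][0] * theta[0] + F[0][1] * theta[1],
        e[1] + F[1][0] * theta[0] + F[1][1] * theta[1],
    ]
    # Solve F s = g by Cramer's rule, then move to theta - s.
    det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    s = [
        (g[0] * F[1][1] - F[0][1] * g[1]) / det,
        (F[0][0] * g[1] - F[1][0] * g[0]) / det,
    ]
    return [theta[0] - s[0], theta[1] - s[1]]


e = [1.0, 2.0]
F = [[-2.0, 1.0], [1.0, -3.0]]  # symmetric and negative definite

theta_new = newton_step([5.0, -7.0], e, F)

# The gradient e + F theta vanishes at the point reached after one step.
gradient = [
    e[0] + F[0][0] * theta_new[0] + F[0][1] * theta_new[1],
    e[1] + F[1][0] * theta_new[0] + F[1][1] * theta_new[1],
]
```

                                    Any other starting point leads to the same `theta_new`, which is
                                    the behavior Problem 3 asks the reader to establish in general.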