
3 Newton's Method and Scoring

3.1 Introduction

This chapter explores some alternatives to maximum likelihood estimation by the EM algorithm. Newton's method and scoring usually converge faster than the EM algorithm. However, the trade-offs of programming ease, numerical stability, and speed of convergence are complex, and statistical geneticists should be fluent in a variety of numerical optimization techniques for finding maximum likelihood estimates. Outside the realm of maximum likelihood, Bayesian procedures have much to offer in small to moderate-sized problems. For those uncomfortable with pulling prior distributions out of thin air, empirical Bayes procedures can be an appealing compromise between classical and Bayesian methods. This chapter illustrates some of these well-known themes in the context of allele frequency estimation and linkage analysis.


3.2 Newton's Method

In iterating toward a maximum point $\hat{\theta}$, Newton's method and scoring rely on quadratic approximations to the loglikelihood $L(\theta)$ of a model. To motivate Newton's method, let us define the score $dL(\theta)$ to be the differential or row vector of first partial derivatives of $L(\theta)$ and the observed information $-d^2L(\theta)$ to be the Hessian matrix of second partial derivatives of $-L(\theta)$. A second-order Taylor's expansion around the current point $\theta_n$ gives
$$
L(\theta) \;\approx\; L(\theta_n) + dL(\theta_n)(\theta - \theta_n) + \frac{1}{2}(\theta - \theta_n)^t\, d^2L(\theta_n)(\theta - \theta_n). \tag{3.1}
$$
In Newton's method, one maximizes the quadratic approximation on the right of (3.1) by setting its gradient
$$
dL(\theta_n)^t + d^2L(\theta_n)(\theta - \theta_n) \;=\; 0
$$
and solving for the next iterate
$$
\theta_{n+1} \;=\; \theta_n - d^2L(\theta_n)^{-1}\, dL(\theta_n)^t.
$$
Obviously, any stationary point of $L(\theta)$ is a fixed point of Newton's algorithm.
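To make the update concrete, here is a minimal sketch in Python; the function names and the binomial example are illustrative assumptions, not from the text. Since the observed information is $-d^2L(\theta)$, the Newton step can equally be written $\theta_{n+1} = \theta_n + [-d^2L(\theta_n)]^{-1} dL(\theta_n)^t$, and the sketch computes it by solving a linear system rather than explicitly inverting the Hessian.

    import numpy as np

    def newton_maximize(score, obs_info, theta, tol=1e-8, max_iter=50):
        """Maximize a loglikelihood L by Newton's method.

        score(theta)    -- dL(theta), the row vector of first partial derivatives
        obs_info(theta) -- -d^2 L(theta), the observed information matrix
        """
        for _ in range(max_iter):
            # theta_{n+1} = theta_n + obs_info(theta_n)^{-1} score(theta_n)^t
            step = np.linalg.solve(obs_info(theta), score(theta))
            theta = theta + step
            if np.linalg.norm(step) < tol:
                break
        return theta

    # Hypothetical example: x successes in n binomial trials, with
    # L(p) = x log p + (n - x) log(1 - p), whose MLE is p = x / n.
    x, n = 7, 20
    score = lambda p: np.array([x / p[0] - (n - x) / (1 - p[0])])
    obs_info = lambda p: np.array([[x / p[0]**2 + (n - x) / (1 - p[0])**2]])
    print(newton_maximize(score, obs_info, np.array([0.5])))  # -> [0.35]

For this particular starting value $p = 0.5$, the iteration lands on the maximum likelihood estimate $x/n = 0.35$ in a single step, after which the score vanishes and the loop exits.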