3 Newton’s Method and Scoring
3.1 Introduction
This chapter explores some alternatives to maximum likelihood estimation by the EM algorithm. Newton’s method and scoring usually converge faster than the EM algorithm. However, the trade-offs of programming ease, numerical stability, and speed of convergence are complex, and statistical geneticists should be fluent in a variety of numerical optimization techniques for finding maximum likelihood estimates. Outside the realm of maximum likelihood, Bayesian procedures have much to offer in small to moderate-sized problems. For those uncomfortable with pulling prior distributions out of thin air, empirical Bayes procedures can be an appealing compromise between classical and Bayesian methods. This chapter illustrates some of these well-known themes in the context of allele frequency estimation and linkage analysis.
3.2 Newton’s Method
In iterating toward a maximum point $\hat{\theta}$, Newton’s method and scoring rely on quadratic approximations to the loglikelihood $L(\theta)$ of a model. To motivate Newton’s method, let us define the score $dL(\theta)$ to be the differential or row vector of first partial derivatives of $L(\theta)$ and the observed information $-d^2 L(\theta)$ to be the Hessian matrix of second partial derivatives of $-L(\theta)$. A second-order Taylor’s expansion around the current point $\theta_n$ gives
$$
L(\theta) \;\approx\; L(\theta_n) + dL(\theta_n)(\theta - \theta_n) + \frac{1}{2}(\theta - \theta_n)^t\, d^2 L(\theta_n)(\theta - \theta_n). \tag{3.1}
$$
In Newton’s method, one maximizes the quadratic approximation on the right of (3.1) by setting its gradient
$$
dL(\theta_n)^t + d^2 L(\theta_n)(\theta - \theta_n) \;=\; 0
$$
and solving for the next iterate
$$
\theta_{n+1} \;=\; \theta_n - d^2 L(\theta_n)^{-1}\, dL(\theta_n)^t.
$$
Obviously, any stationary point of $L(\theta)$ is a fixed point of Newton’s algorithm.
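To make the update concrete, here is a minimal Python sketch of the Newton iteration, applied to the binomial loglikelihood $L(p) = x \ln p + (n - x)\ln(1 - p)$ that arises in simple allele frequency problems; the function name newton_maximize, its stopping tolerance, and the toy counts are illustrative assumptions, not material from the text.

    import numpy as np

    def newton_maximize(score, hessian, theta, tol=1e-8, max_iter=100):
        """Maximize a loglikelihood by Newton's method.

        score(theta) returns the score dL(theta) as a vector;
        hessian(theta) returns the Hessian d^2 L(theta) as a matrix.
        Iterates theta_{n+1} = theta_n - d^2 L(theta_n)^{-1} dL(theta_n)^t.
        """
        for _ in range(max_iter):
            # Solve d^2 L(theta_n) step = dL(theta_n)^t instead of inverting.
            step = np.linalg.solve(hessian(theta), score(theta))
            theta = theta - step
            if np.linalg.norm(step) < tol:
                break
        return theta

    # Toy data (hypothetical): x = 7 successes in n = 10 trials.
    x, n = 7.0, 10.0
    score = lambda p: np.array([x / p[0] - (n - x) / (1.0 - p[0])])
    hessian = lambda p: np.array([[-x / p[0] ** 2 - (n - x) / (1.0 - p[0]) ** 2]])
    print(newton_maximize(score, hessian, np.array([0.5])))  # converges to [0.7]

Started from $p = 0.5$, the iterates converge quickly to the maximum likelihood estimate $\hat{p} = x/n = 0.7$. Note that the update solves a linear system rather than forming the matrix inverse explicitly, a choice that becomes important once $\theta$ has more than a handful of components.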