3 Newton’s Method and Scoring
3.1 Introduction
This chapter explores some alternatives to maximum likelihood estimation by the EM algorithm. Newton’s method and scoring usually converge faster than the EM algorithm. However, the trade-offs of programming ease, numerical stability, and speed of convergence are complex, and statistical geneticists should be fluent in a variety of numerical optimization techniques for finding maximum likelihood estimates. Outside the realm of maximum likelihood, Bayesian procedures have much to offer in small to moderate-sized problems. For those uncomfortable with pulling prior distributions out of thin air, empirical Bayes procedures can be an appealing compromise between classical and Bayesian methods. This chapter illustrates some of these well-known themes in the context of allele frequency estimation and linkage analysis.
3.2 Newton’s Method
In iterating toward a maximum point $\hat{\theta}$, Newton’s method and scoring rely on quadratic approximations to the loglikelihood $L(\theta)$ of a model. To motivate Newton’s method, let us define the score $dL(\theta)$ to be the differential or row vector of first partial derivatives of $L(\theta)$ and the observed information $-d^2 L(\theta)$ to be the Hessian matrix of second partial derivatives of $-L(\theta)$. A second-order Taylor’s expansion around the current point $\theta_n$ gives
$$
L(\theta) \;\approx\; L(\theta_n) + dL(\theta_n)(\theta - \theta_n) + \frac{1}{2}(\theta - \theta_n)^t\, d^2 L(\theta_n)(\theta - \theta_n). \tag{3.1}
$$
In Newton’s method, one maximizes the quadratic approximation on the right of (3.1) by setting its gradient
$$
dL(\theta_n)^t + d^2 L(\theta_n)(\theta - \theta_n) \;=\; 0
$$
and solving for the next iterate
$$
\theta_{n+1} \;=\; \theta_n - d^2 L(\theta_n)^{-1}\, dL(\theta_n)^t.
$$
Obviously, any stationary point of $L(\theta)$ is a fixed point of Newton’s algorithm.
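To make the update concrete, here is a minimal Python sketch of the Newton iteration, applied to the binomial loglikelihood $L(p) = x \ln p + (n - x)\ln(1 - p)$ that arises in simple allele frequency problems; the function name newton_maximize, its stopping tolerance, and the toy counts are illustrative assumptions, not material from the text.

    import numpy as np

    def newton_maximize(score, hessian, theta, tol=1e-8, max_iter=100):
        """Maximize a loglikelihood by Newton's method.

        score(theta) returns the score dL(theta) as a vector;
        hessian(theta) returns the Hessian d^2 L(theta) as a matrix.
        Iterates theta_{n+1} = theta_n - d^2 L(theta_n)^{-1} dL(theta_n)^t.
        """
        for _ in range(max_iter):
            # Solve d^2 L(theta_n) step = dL(theta_n)^t instead of inverting.
            step = np.linalg.solve(hessian(theta), score(theta))
            theta = theta - step
            if np.linalg.norm(step) < tol:
                break
        return theta

    # Toy data (hypothetical): x = 7 successes in n = 10 trials.
    x, n = 7.0, 10.0
    score = lambda p: np.array([x / p[0] - (n - x) / (1.0 - p[0])])
    hessian = lambda p: np.array([[-x / p[0] ** 2 - (n - x) / (1.0 - p[0]) ** 2]])
    print(newton_maximize(score, hessian, np.array([0.5])))  # converges to [0.7]

Started from $p = 0.5$, the iterates converge quickly to the maximum likelihood estimate $\hat{p} = x/n = 0.7$. Note that the update solves a linear system rather than forming the matrix inverse explicitly, a choice that becomes important once $\theta$ has more than a handful of components.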