
3. Newton’s Method and Scoring
deduce this fact by applying the moment formula (3.13) in the conditional
density computation
$$
\frac{\dbinom{2n}{n_1 \cdots n_k}\,\dfrac{\Gamma(\alpha_.)}{\prod_{i=1}^{k}\Gamma(\alpha_i)}\,
      \prod_{i=1}^{k} p_i^{\,n_i+\alpha_i-1}}
     {\dbinom{2n}{n_1 \cdots n_k}\,\dfrac{\Gamma(\alpha_.)}{\prod_{i=1}^{k}\Gamma(\alpha_i)}\,
      \displaystyle\int_{\Delta_k}\prod_{i=1}^{k} q_i^{\,n_i+\alpha_i-1}\,dq}
\;=\; \frac{\Gamma(2n+\alpha_.)}{\prod_{i=1}^{k}\Gamma(n_i+\alpha_i)}\,
      \prod_{i=1}^{k} p_i^{\,n_i+\alpha_i-1}.
$$
The posterior mean $(n_i + \alpha_i)/(2n + \alpha_.)$ is a strongly consistent, asymptotically unbiased estimator of $p_i$.
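As a quick numerical illustration of this posterior mean, here is a minimal Python sketch; the allele counts and the uniform prior ($\alpha_i = 1$) are hypothetical choices, not data from the text.

```python
import numpy as np

# Hypothetical allele counts n_1, ..., n_k from a sample of 2n genes (k = 3 alleles)
counts = np.array([58.0, 31.0, 11.0])
alpha = np.ones_like(counts)        # uniform Dirichlet prior, alpha_i = 1

two_n = counts.sum()                # 2n, the number of genes sampled
alpha_dot = alpha.sum()             # alpha_. = alpha_1 + ... + alpha_k

# Posterior is Dirichlet(n_i + alpha_i); its mean estimates p_i
posterior_mean = (counts + alpha) / (two_n + alpha_dot)
print(posterior_mean)               # shrinks the raw frequencies n_i / 2n toward the prior mean
```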
                                The primary drawback of being Bayesian in this situation is that there
                              is no obvious way of selecting a reasonable prior. However, if data from
                              several distinct populations are available, then one can select an appropriate
                              prior empirically. Consider the marginal distribution of the allele counts
$(N_1,\ldots,N_k)^t$ in a sample of genes from a single population. Integrating
out the prior on the allele frequency vector $p = (p_1,\ldots,p_k)^t$ yields the
predictive distribution [16]
$$
\Pr(N_1 = n_1,\ldots,N_k = n_k)
\;=\; \binom{2n}{n_1 \cdots n_k}\,
\frac{\Gamma(\alpha_.)}{\Gamma(2n+\alpha_.)}
\prod_{i=1}^{k}\frac{\Gamma(n_i+\alpha_i)}{\Gamma(\alpha_i)}.
\tag{3.14}
$$
                              This distribution is known as the Dirichlet-multinomial distribution.
                              Its parameters are the α’s rather than the p’s.
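For numerical work it is convenient to evaluate (3.14) on the log scale through log-gamma functions. The sketch below (using SciPy's gammaln; the function name and the example counts are ours and purely illustrative) returns the log of the Dirichlet-multinomial probability for one population's counts.

```python
import numpy as np
from scipy.special import gammaln

def dirichlet_multinomial_logpmf(counts, alpha):
    """Log of the predictive probability (3.14) for one population's allele counts."""
    counts = np.asarray(counts, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    two_n = counts.sum()                     # 2n genes sampled
    alpha_dot = alpha.sum()                  # alpha_.
    # log multinomial coefficient (2n choose n_1 ... n_k)
    log_coef = gammaln(two_n + 1.0) - gammaln(counts + 1.0).sum()
    # log of Gamma(alpha_.)/Gamma(2n + alpha_.) * prod_i Gamma(n_i + alpha_i)/Gamma(alpha_i)
    log_ratio = (gammaln(alpha_dot) - gammaln(two_n + alpha_dot)
                 + (gammaln(counts + alpha) - gammaln(alpha)).sum())
    return log_coef + log_ratio

# Example with hypothetical counts and a uniform prior
print(dirichlet_multinomial_logpmf([58, 31, 11], [1.0, 1.0, 1.0]))
```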
With independent data from several distinct populations, one can estimate
the parameter vector $\alpha = (\alpha_1,\ldots,\alpha_k)^t$ of the Dirichlet-multinomial
                              distribution by maximum likelihood. The estimated α can then be recycled
                              to compute the posterior means of the allele frequencies for the separate
                              populations. This interplay between frequentist and Bayesian techniques is
                              typical of the empirical Bayes method.
                                To estimate the parameter vector α characterizing the prior, we again
revert to Newton’s method. We need the score $dL(\alpha)$ and the observed
information $-d^2 L(\alpha)$ for each population. Based on the likelihood (3.14),
                              elementary calculus shows that the score has entries
$$
\frac{\partial}{\partial\alpha_i} L(\alpha)
\;=\; \psi(\alpha_.) - \psi(2n+\alpha_.) + \psi(n_i+\alpha_i) - \psi(\alpha_i),
\tag{3.15}
$$
where $\psi(s) = \frac{d}{ds}\ln\Gamma(s)$ is the digamma function [9]. The observed information has entries
$$
-\frac{\partial^2}{\partial\alpha_i\,\partial\alpha_j} L(\alpha)
\;=\; -\psi'(\alpha_.) + \psi'(2n+\alpha_.)
      - 1_{\{i=j\}}\bigl[\psi'(n_i+\alpha_i) - \psi'(\alpha_i)\bigr],
\tag{3.16}
$$
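As a concrete sketch of the resulting Newton iteration, the score (3.15) and observed information (3.16) can be coded with SciPy's digamma and polygamma functions. The per-population count matrix below is hypothetical, and the loop is bare Newton's method with only a crude positivity safeguard; a careful implementation would add step halving on the log-likelihood and a proper convergence diagnostic.

```python
import numpy as np
from scipy.special import digamma, polygamma

def score_and_obs_info(alpha, counts):
    """Score dL(alpha) and observed information -d^2 L(alpha), summed over populations.

    counts has shape (populations, k); each row holds one population's allele
    counts n_1, ..., n_k from 2n sampled genes.
    """
    k = alpha.size
    score = np.zeros(k)
    info = np.zeros((k, k))
    alpha_dot = alpha.sum()
    trigamma = lambda x: polygamma(1, x)     # psi'(x)
    for n in counts:
        two_n = n.sum()
        # score entries, equation (3.15)
        score += (digamma(alpha_dot) - digamma(two_n + alpha_dot)
                  + digamma(n + alpha) - digamma(alpha))
        # observed information entries, equation (3.16)
        info += -trigamma(alpha_dot) + trigamma(two_n + alpha_dot)
        info -= np.diag(trigamma(n + alpha) - trigamma(alpha))
    return score, info

# Hypothetical allele counts from three populations (k = 3 alleles)
counts = np.array([[58.0, 31.0, 11.0],
                   [47.0, 40.0, 13.0],
                   [63.0, 25.0, 12.0]])

alpha = np.ones(3)                           # starting value
for _ in range(100):
    score, info = score_and_obs_info(alpha, counts)
    step = np.linalg.solve(info, score)      # Newton step: alpha + (-d^2 L)^{-1} dL
    while np.any(alpha + step <= 0):         # crude safeguard: keep alpha positive
        step *= 0.5
    alpha = alpha + step
    if np.max(np.abs(step)) < 1e-8:
        break

print(alpha)   # estimated prior; recycle via (n_i + alpha_i)/(2n + alpha_.) per population
```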