Page 65 - Applied Probability
3. Newton’s Method and Scoring
It follows that the density function of Z is
$$\prod_{i=1}^{k} \frac{y_i^{\alpha_i - 1}}{\Gamma(\alpha_i)} \; s^{\sum_{i=1}^{k} \alpha_i - 1} e^{-s}.$$
Integrating out the variable $s$, we find that $(Y_1,\ldots,Y_{k-1})^t$ has density
$$\frac{\Gamma(\alpha_{.})}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \prod_{i=1}^{k} y_i^{\alpha_i - 1}, \tag{3.12}$$
where $\alpha_{.} = \sum_{i=1}^{k} \alpha_i$. It is more convenient to think of the density (3.12) as
applying to the whole random vector $Y$. From this perspective, the density
exists relative to the uniform measure on the unit simplex
$$\Delta_k = \Big\{ (y_1,\ldots,y_k)^t : y_1 > 0, \ldots, y_k > 0, \ \sum_{i=1}^{k} y_i = 1 \Big\}.$$
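The construction behind (3.12) can be sketched numerically. The snippet below assumes, as the joint density in $y$ and $s$ suggests, that the components of $Z$ are independent gamma variates with shapes $\alpha_i$ and unit scale, and that $Y_i = Z_i/S$ with $S$ their sum. The function name `sample_dirichlet` and the parameter values are illustrative only and do not come from the text.

```python
import random

def sample_dirichlet(alphas, rng=random):
    """Draw one point on the unit simplex by normalizing gamma variates.

    Assumption: Z_i ~ Gamma(shape=alpha_i, scale=1), independent.
    """
    if any(a <= 0 for a in alphas):
        raise ValueError("every alpha_i must be positive")
    # Independent gamma variates Z_1, ..., Z_k.
    z = [rng.gammavariate(a, 1.0) for a in alphas]
    # S = Z_1 + ... + Z_k is the variable s that gets integrated out.
    s = sum(z)
    # Y_i = Z_i / S lies in the unit simplex.
    return [zi / s for zi in z]

# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(0)
y = sample_dirichlet([1.0, 2.0, 3.0], rng)
print(y, sum(y))
```

Each draw has strictly positive coordinates that sum to 1 up to rounding, so it is a point of $\Delta_k$.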
Once the density (3.12) is in hand, the elegant moment formula
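$$\mathrm{E}\Big( \prod_{i=1}^{k} Y_i^{m_i} \Big) = \frac{\Gamma(\alpha_{.})}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \int_{\Delta_k} \prod_{i=1}^{k} y_i^{m_i + \alpha_i - 1} \, dy = \frac{\Gamma(\alpha_{.})}{\Gamma(m_{.} + \alpha_{.})} \prod_{i=1}^{k} \frac{\Gamma(m_i + \alpha_i)}{\Gamma(\alpha_i)} \tag{3.13}$$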
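follows immediately from the fact that the density has total mass 1: the
integrand is proportional to a density of the form (3.12) with each $\alpha_i$
replaced by $m_i + \alpha_i$. The moment formula (3.13) and the factorial
property $\Gamma(t+1) = t\,\Gamma(t)$ of the gamma function together yield the
mean $\mathrm{E}(Y_i) = \alpha_i/\alpha_{.}$.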
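As a quick check of the moment formula, the right-hand side of (3.13) can be evaluated on the log scale with the standard library's `math.lgamma`. This is only a sketch: the helper name `dirichlet_moment` and the parameter values are hypothetical, and the code assumes every $m_i + \alpha_i$ is positive.

```python
from math import exp, lgamma

def dirichlet_moment(alphas, ms):
    """Evaluate the right-hand side of (3.13) for exponents m_1, ..., m_k."""
    if len(alphas) != len(ms):
        raise ValueError("alphas and ms must have the same length")
    alpha_dot = sum(alphas)  # alpha_. = sum of the alpha_i
    m_dot = sum(ms)          # m_. = sum of the m_i
    # log of Gamma(alpha_.) / Gamma(m_. + alpha_.)
    log_value = lgamma(alpha_dot) - lgamma(m_dot + alpha_dot)
    # add log of Gamma(m_i + alpha_i) / Gamma(alpha_i) for each i
    for a, m in zip(alphas, ms):
        log_value += lgamma(m + a) - lgamma(a)
    return exp(log_value)

# Hypothetical parameters; the two printed values should agree up to rounding.
alphas = [1.0, 2.0, 3.0]
print(dirichlet_moment(alphas, [1, 0, 0]))  # E(Y_1) from (3.13)
print(alphas[0] / sum(alphas))              # alpha_1 / alpha_.
```

Setting $m_i = 1$ and all other exponents to 0 reduces (3.13) to $\Gamma(\alpha_{.})\,\Gamma(\alpha_i + 1) / [\Gamma(\alpha_{.} + 1)\,\Gamma(\alpha_i)]$, which the factorial property collapses to $\alpha_i/\alpha_{.}$.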
3.7 Empirical Bayes Estimation of Allele Frequencies
Consider a locus with $k$ codominant alleles. If in a sample of $n$ people
allele $i$ appears $n_i$ times, then the maximum likelihood estimate of the $i$th
allele frequency is $n_i/(2n)$. This classical estimate based on the multinomial
distribution can be contrasted with a Bayes estimate using a Dirichlet prior
for the allele frequencies $p_1,\ldots,p_k$ [13].
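The classical estimate is easy to compute from allele counts. The sketch below assumes each of the $n$ people contributes two genes at the locus, so the counts must sum to $2n$. The function name `mle_allele_frequencies` and the counts are made up for illustration.

```python
def mle_allele_frequencies(counts, n):
    """Return the estimates n_i / (2n) for allele counts n_1, ..., n_k."""
    if n <= 0:
        raise ValueError("n must be a positive number of people")
    if sum(counts) != 2 * n:
        # Assumption: every person carries exactly two genes at the locus.
        raise ValueError("allele counts must sum to 2n")
    return [c / (2 * n) for c in counts]

# Hypothetical sample: n = 5 people, k = 3 alleles, 10 genes in total.
print(mle_allele_frequencies([6, 3, 1], n=5))
```

An allele that happens not to appear in the sample receives an estimated frequency of exactly 0 under this rule.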
The Dirichlet prior is a conjugate prior for the multinomial distribution
[14]. This means that if the allele frequency vector $p = (p_1,\ldots,p_k)^t$ has
a Dirichlet prior with parameters $\alpha_1,\ldots,\alpha_k$, then taking into account the
data, $p$ has a Dirichlet posterior with parameters $n_1 + \alpha_1,\ldots,n_k + \alpha_k$. We