Page 65 - Applied Probability

3. Newton’s Method and Scoring
                              It follows that the density function of $Z$ is

                              \[
                                \prod_{i=1}^{k} \frac{y_i^{\alpha_i - 1}}{\Gamma(\alpha_i)} \; s^{\sum_{i=1}^{k} \alpha_i - 1} e^{-s}.
                              \]
                              Integrating out the variable $s$, we find that $(Y_1, \ldots, Y_{k-1})^t$ has density

                              \[
                                \frac{\Gamma(\alpha_\cdot)}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \prod_{i=1}^{k} y_i^{\alpha_i - 1}, \tag{3.12}
                              \]
                              where $\alpha_\cdot = \sum_{i=1}^{k} \alpha_i$. It is more convenient to think of the density (3.12) as
                              applying to the whole random vector $Y$. From this perspective, the density
                              exists relative to the uniform measure on the unit simplex

                              \[
                                \Delta_k = \Bigl\{ (y_1, \ldots, y_k)^t : y_1 > 0, \ldots, y_k > 0, \; \sum_{i=1}^{k} y_i = 1 \Bigr\}.
                              \]
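
As a rough numerical companion to the density (3.12), the following Python sketch evaluates it on the log scale at a point of $\Delta_k$. The function names `in_simplex` and `dirichlet_log_density` and the tolerance `tol` are illustrative choices of ours, not notation from the text.

```python
import math

def in_simplex(y, tol=1e-12):
    """Return True if y has positive entries that sum to 1 (within tol)."""
    return all(y_i > 0 for y_i in y) and abs(sum(y) - 1.0) <= tol

def dirichlet_log_density(y, alpha):
    """Log of the density (3.12) at a point y of the unit simplex."""
    if len(y) != len(alpha) or not in_simplex(y):
        raise ValueError("y must lie in the unit simplex and match alpha in length")
    # Log of the normalizing constant Gamma(alpha.) / prod_i Gamma(alpha_i).
    log_const = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    # Log of the kernel prod_i y_i^(alpha_i - 1).
    log_kernel = sum((a - 1.0) * math.log(y_i) for a, y_i in zip(alpha, y))
    return log_const + log_kernel
```

When every $\alpha_i = 1$, the density reduces to the constant $\Gamma(k)$, so `dirichlet_log_density([0.2, 0.3, 0.5], [1.0, 1.0, 1.0])` should equal $\log 2$ up to rounding.
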
                                Once the density (3.12) is in hand, the elegant moment formula
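
                              \[
                                \begin{aligned}
                                  \mathrm{E}\Bigl( \prod_{i=1}^{k} Y_i^{m_i} \Bigr)
                                    &= \frac{\Gamma(\alpha_\cdot)}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \int_{\Delta_k} \prod_{i=1}^{k} y_i^{m_i + \alpha_i - 1} \, dy \\
                                    &= \frac{\Gamma(\alpha_\cdot)}{\Gamma(m_\cdot + \alpha_\cdot)} \prod_{i=1}^{k} \frac{\Gamma(m_i + \alpha_i)}{\Gamma(\alpha_i)}
                                \end{aligned}
                                \tag{3.13}
                              \]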
                              follows immediately from the fact that the density has total mass 1; here
                              $m_\cdot = \sum_{i=1}^{k} m_i$. The moment formula (3.13) and the factorial property
                              $\Gamma(t+1) = t\,\Gamma(t)$ of the gamma function together yield the mean
                              $\mathrm{E}(Y_i) = \alpha_i / \alpha_\cdot$.
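
To see the mean concretely, take $m_i = 1$ and all other exponents equal to 0 in (3.13); the right-hand side becomes $\Gamma(\alpha_\cdot)\Gamma(\alpha_i + 1) / [\Gamma(\alpha_\cdot + 1)\Gamma(\alpha_i)] = \alpha_i/\alpha_\cdot$. The Python sketch below evaluates (3.13) on the log scale and compares it with a Monte Carlo average. It assumes the standard representation of a Dirichlet vector as independent unit-scale gamma variates divided by their sum; the function names, the seed, and the sample size are our own illustrative choices.

```python
import math
import random

def dirichlet_moment(alpha, m):
    """E[prod_i Y_i^{m_i}] from formula (3.13), computed via log-gamma."""
    a_dot = sum(alpha)
    m_dot = sum(m)
    log_val = math.lgamma(a_dot) - math.lgamma(m_dot + a_dot)
    for a_i, m_i in zip(alpha, m):
        log_val += math.lgamma(m_i + a_i) - math.lgamma(a_i)
    return math.exp(log_val)

def sample_dirichlet(alpha, rng):
    """Draw Y by normalizing independent gamma variates (assumed construction)."""
    x = [rng.gammavariate(a_i, 1.0) for a_i in alpha]
    s = sum(x)
    return [x_i / s for x_i in x]

if __name__ == "__main__":
    alpha = [1.0, 2.0, 3.0]
    rng = random.Random(0)  # fixed seed so the run is repeatable
    draws = [sample_dirichlet(alpha, rng) for _ in range(20000)]
    mc_mean = sum(y[0] for y in draws) / len(draws)
    # Both numbers should be close to alpha_1 / alpha. = 1/6.
    print(dirichlet_moment(alpha, [1, 0, 0]), mc_mean)
```

Working with `math.lgamma` rather than `math.gamma` is a design choice that avoids overflow when the parameters or exponents are large.
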

                              3.7 Empirical Bayes Estimation of Allele Frequencies
                              Consider a locus with $k$ codominant alleles. If in a sample of $n$ people
                              allele $i$ appears $n_i$ times, then the maximum likelihood estimate of the $i$th
                              allele frequency is $n_i/(2n)$. This classical estimate based on the multinomial
                              distribution can be contrasted to a Bayes estimate using a Dirichlet prior
                              for the allele frequencies $p_1, \ldots, p_k$ [13].
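
The contrast between the two estimates can be sketched in a few lines of Python. The divisor $2n$ arises because each of the $n$ people carries two alleles at the locus. As assumptions of ours, the sketch takes the Bayes estimate to be the posterior mean $(n_i + \alpha_i)/(2n + \alpha_\cdot)$ under the conjugate Dirichlet update, codes the alleles as integers $0, \ldots, k-1$, and uses made-up genotypes and function names purely for illustration.

```python
def allele_counts(genotypes, k):
    """Count how often each allele 0..k-1 appears in a list of genotype pairs."""
    counts = [0] * k
    for a, b in genotypes:
        counts[a] += 1
        counts[b] += 1
    return counts

def mle_frequencies(counts):
    """Maximum likelihood estimates n_i / (2n); note that sum(counts) = 2n."""
    total = sum(counts)
    return [n_i / total for n_i in counts]

def posterior_mean_frequencies(counts, alpha):
    """Posterior means (n_i + alpha_i) / (2n + alpha.) under a Dirichlet prior
    (assumed form of the Bayes estimate)."""
    total = sum(counts) + sum(alpha)
    return [(n_i + a_i) / total for n_i, a_i in zip(counts, alpha)]

if __name__ == "__main__":
    # Hypothetical sample of n = 4 people at a locus with k = 3 alleles.
    genotypes = [(0, 0), (0, 1), (1, 2), (0, 1)]
    counts = allele_counts(genotypes, 3)
    print(mle_frequencies(counts))
    print(posterior_mean_frequencies(counts, [1.0, 1.0, 1.0]))
```

In this toy sample the prior pulls each estimate toward $1/k$, which matters most for rare alleles with small counts.
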
                                The Dirichlet prior is a conjugate prior for the multinomial distribution
                              [14]. This means that if the allele frequency vector $p = (p_1, \ldots, p_k)^t$ has
                              a Dirichlet prior with parameters $\alpha_1, \ldots, \alpha_k$, then taking into account the
                              data, $p$ has a Dirichlet posterior with parameters $n_1 + \alpha_1, \ldots, n_k + \alpha_k$. We