Page 260 - Applied Probability

to its left argument. In effect, we update $\gamma$ by one step of Newton's method applied to the function $Q(\gamma \mid \gamma_{\text{old}}) + R(\gamma)$, keeping in mind the identity $d^{10}Q(\gamma_{\text{old}} \mid \gamma_{\text{old}}) = dL(\gamma_{\text{old}})$ proved in Problem 9 of Chapter 2.
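
All of the terms appearing in (11.19) are straightforward to evaluate. For instance, taking into account relation (11.1) with $\lambda = 1$, we have

$$
\begin{aligned}
\frac{\partial}{\partial \delta_i} L(\gamma) &= \frac{\partial}{\partial \theta_i} L(\gamma)\, \frac{d\theta_i}{d\delta_i} \\
&= \frac{\partial}{\partial \theta_i} L(\gamma)\,(1 - \theta_i).
\end{aligned}
$$
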
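Differentiation of the log prior produces

$$
\begin{aligned}
\frac{\partial}{\partial \delta_i} R(\gamma) &= -\frac{1}{d - \delta_1 - \cdots - \delta_{m-1}} \\
\frac{\partial}{\partial r} R(\gamma) &= 0 \\
-d^2 R &= (dR)^t\, dR.
\end{aligned}
$$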
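
The three identities for the log prior can be checked numerically. The sketch below assumes that, restricted to the $\delta$ components, the log prior has the form $R = \ln(d - \delta_1 - \cdots - \delta_{m-1})$ up to an additive constant. That form is inferred from the stated derivative and is not quoted from the text; the function names and the numbers are purely illustrative.

```python
import numpy as np

def log_prior(delta, d):
    # ASSUMED form of R on the delta components (inferred from its derivative).
    return np.log(d - delta.sum())

def grad_log_prior(delta, d):
    # Every partial derivative equals -1 / (d - delta_1 - ... - delta_{m-1}).
    return np.full(delta.shape, -1.0 / (d - delta.sum()))

def hess_log_prior(delta, d):
    # The identity -d^2 R = (dR)^t dR makes the Hessian a rank-one matrix.
    g = grad_log_prior(delta, d)
    return -np.outer(g, g)

def finite_difference_jacobian(f, x, h=1e-6):
    # Central differences, one column per coordinate of x.
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        cols.append((f(x + step) - f(x - step)) / (2.0 * h))
    return np.column_stack(cols)

# Hypothetical values chosen only so that d - sum(delta) > 0.
delta = np.array([0.3, 0.5, 0.2])
d = 2.0

grad = grad_log_prior(delta, d)
hess = hess_log_prior(delta, d)

# Compare the closed forms against numerical derivatives.
num_grad = finite_difference_jacobian(lambda x: log_prior(x, d), delta)
num_hess = finite_difference_jacobian(lambda x: grad_log_prior(x, d), delta)
assert np.allclose(grad, num_grad, atol=1e-6)
assert np.allclose(hess, num_hess, atol=1e-6)
```

Because the Hessian of $R$ is a negated outer product, it never needs to be stored as a full matrix; the gradient vector alone determines it.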

Computation of the diagonal matrix $d^{20}Q(\gamma \mid \gamma)$ is more complicated. Let $N_i$ be the random number of chromosomes in the sample with breaks between loci $i$ and $i + 1$. As noted earlier, this random variable has a binomial distribution with success probability $\theta_i$ and $cH$ trials. Because of the nature of the complete data likelihood, it follows that modulo an irrelevant constant,

$$
Q(\gamma \mid \gamma_{\text{old}}) = E(N_i \mid \text{obs}, \gamma_{\text{old}}) \ln(\theta_i) + E([cH - N_i] \mid \text{obs}, \gamma_{\text{old}}) \ln(1 - \theta_i).
$$

Straightforward calculations show that

$$
\frac{\partial^2}{\partial \delta_i^2} Q(\gamma \mid \gamma_{\text{old}}) = -\frac{E(N_i \mid \text{obs}, \gamma_{\text{old}})(1 - \theta_i)}{\theta_i^2}.
$$
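
The second-derivative formula can be sanity-checked with a finite difference. The sketch below assumes that relation (11.1) with $\lambda = 1$ reads $\theta_i = 1 - e^{-\delta_i}$, which is consistent with $d\theta_i / d\delta_i = 1 - \theta_i$ used earlier but is not restated in this passage. The numerical values standing in for the conditional expectation and for $cH$ are hypothetical.

```python
import math

def theta_of_delta(delta):
    # ASSUMED form of relation (11.1) with lambda = 1.
    return 1.0 - math.exp(-delta)

def q_term(delta, expected_breaks, trials):
    # The part of Q that depends on delta_i, written as a function of delta_i.
    theta = theta_of_delta(delta)
    return (expected_breaks * math.log(theta)
            + (trials - expected_breaks) * math.log(1.0 - theta))

def q_second_derivative(delta, expected_breaks):
    # Closed form from the text: -E(N_i | obs, gamma_old) (1 - theta_i) / theta_i^2.
    theta = theta_of_delta(delta)
    return -expected_breaks * (1.0 - theta) / theta**2

def central_second_difference(f, x, h=1e-4):
    # Standard three-point approximation to the second derivative.
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

# Hypothetical inputs for illustration only.
delta = 0.4
expected_breaks = 12.5   # stands in for E(N_i | obs, gamma_old)
trials = 40.0            # stands in for cH

analytic = q_second_derivative(delta, expected_breaks)
numeric = central_second_difference(
    lambda x: q_term(x, expected_breaks, trials), delta)
assert math.isclose(analytic, numeric, rel_tol=1e-4)
```

Under the assumed form of (11.1), $\ln(1 - \theta_i) = -\delta_i$ is linear in $\delta_i$, which is why the expectation of $cH - N_i$ drops out of the second derivative.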
If $\tilde{\theta}_{\text{new},i}$ is the EM update of $\theta_i$ ignoring the prior, then as remarked previously, $E(N_i \mid \text{obs}, \gamma_{\text{old}}) = cH\tilde{\theta}_{\text{new},i}$.
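
The pieces assembled so far give the Newton step a convenient shape: $-d^{20}Q$ is diagonal with entries $cH\tilde{\theta}_{\text{new},i}(1 - \theta_i)/\theta_i^2$, and $-d^2R = (dR)^t\, dR$ has rank one. A minimal sketch follows of how a linear system with such a diagonal-plus-rank-one matrix can be solved in linear time by the Sherman-Morrison formula. It is restricted to the $\delta$ components and assumes $\theta_i = 1 - e^{-\delta_i}$. All numbers are hypothetical, and the right-hand side `v` is a placeholder, since the score $dL$ is not available in this passage.

```python
import numpy as np

def sherman_morrison_solve(a_diag, u, v):
    """Solve (A + u u^t) x = v, where A is diagonal with positive entries a_diag."""
    a_inv_v = v / a_diag          # A^{-1} v costs O(m) for diagonal A
    a_inv_u = u / a_diag          # A^{-1} u
    return a_inv_v - a_inv_u * (u @ a_inv_v) / (1.0 + u @ a_inv_u)

# Hypothetical inputs for illustration only.
trials = 100.0                                   # stands in for cH
theta_old = np.array([0.20, 0.35, 0.15])         # current breakage probabilities
theta_tilde_new = np.array([0.22, 0.30, 0.18])   # EM updates ignoring the prior
d = 2.0                                          # constant appearing in the log prior

# ASSUMPTION: invert theta = 1 - exp(-delta) to recover the current distances.
delta_old = -np.log(1.0 - theta_old)

# Diagonal of A = -d^{20}Q at gamma_old, using E(N_i | obs, gamma_old) = cH * theta_tilde.
a_diag = trials * theta_tilde_new * (1.0 - theta_old) / theta_old**2

# u holds the partial derivatives of the log prior with respect to each delta_i.
u = np.full(theta_old.shape, -1.0 / (d - delta_old.sum()))

# Placeholder right-hand side; in the text this is built from dL and dR.
v = np.array([1.5, -0.7, 0.4])

x = sherman_morrison_solve(a_diag, u, v)

# Cross-check against a dense solve of the same system.
dense = np.linalg.solve(np.diag(a_diag) + np.outer(u, u), v)
assert np.allclose(x, dense)
```

The dense solve is included only as a check; the point of the formula is that the full matrix never has to be formed or factored.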

It is possible to simplify the EM gradient update (11.19) by applying the Sherman-Morrison formula discussed in Chapter 3. In the present context, we need to compute $(A + uu^t)^{-1} v$ for the diagonal matrix

$$
A = -d^{20}Q(\gamma_{\text{old}} \mid \gamma_{\text{old}})
$$

and the vectors $u = dR(\gamma_{\text{old}})^t$ and $v = dL(\gamma_{\text{old}}) + dR(\gamma_{\text{old}})$. Because $R(\gamma)$ does not depend on $r$, the partial derivative $\frac{\partial}{\partial r} R(\gamma)$ vanishes. Thus, the matrix $A + uu^t$ is block diagonal, and the EM gradient update for the parameter $r$ coincides with the EM gradient update for $r$ ignoring the