Page 133 - Applied Probability

7. Computation of Mendelian Likelihoods
all consolidated intervals of the corresponding recombination fractions $\theta$ or
of their complements $1-\theta$, depending on whether the gamete shows
recombination on a given interval or not. The factor of $\frac{1}{2}$ accounts for
the parental chromosome chosen for the first locus. In the exceptional case where
there are no heterozygous loci, the gamete transmission probability is 1. If there
is only one heterozygous locus, the gamete transmission probability is $\frac{1}{2}$.
Recombination fractions for consolidated intervals can be computed via
Trow’s formula as described in Problem 1.
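The rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Trow's formula is taken here in the form $\theta_{ac} = \theta_{ab} + \theta_{bc} - 2\theta_{ab}\theta_{bc}$, which presumes independent recombination on disjoint intervals (Problem 1 remains the authoritative statement), and the names `trow`, `gamete_probability`, and the `origin` encoding of which parental chromosome supplied each transmitted allele are hypothetical.

```python
from functools import reduce


def trow(theta_a, theta_b):
    """Recombination fraction across two adjacent intervals.

    Assumes recombination events on disjoint intervals are independent.
    """
    return theta_a + theta_b - 2.0 * theta_a * theta_b


def gamete_probability(heterozygous, thetas, origin):
    """Gamete transmission probability for one parent.

    heterozygous: list of bool, one entry per locus in map order.
    thetas: recombination fractions between adjacent loci (one fewer than loci).
    origin: list of 0/1 per locus naming the parental chromosome that supplied
            the transmitted allele; entries at homozygous loci are ignored.
    """
    het = [i for i, is_het in enumerate(heterozygous) if is_het]
    if not het:
        return 1.0  # no heterozygous loci: the gamete is fully determined
    prob = 0.5  # choice of parental chromosome at the first heterozygous locus
    for a, b in zip(het, het[1:]):
        # Consolidate every map interval lying between heterozygous loci a and b.
        theta = reduce(trow, thetas[a:b], 0.0)
        recombined = origin[a] != origin[b]
        prob *= theta if recombined else 1.0 - theta
    return prob
```

With three loci of which only the outer two are heterozygous, `gamete_probability([True, False, True], [0.1, 0.2], [0, 0, 1])` consolidates the two map intervals into a single interval before applying the factor of one half.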
  The likelihood $L$ of a pedigree with $n$ people can now be assembled from
these component parts. Let the $i$th person have phenotype $X_i$ and possible
genotype $G_i$. Conditioning on the genotypes of each of the $n$ people yields
Ott’s [27] representation of the likelihood

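\begin{align}
L &= \sum_{G_1} \cdots \sum_{G_n} \Pr(X_1,\ldots,X_n \mid G_1,\ldots,G_n)\,\Pr(G_1,\ldots,G_n) \notag \\
  &= \sum_{G_1} \cdots \sum_{G_n} \prod_i \operatorname{Pen}(X_i \mid G_i)\,\Pr(G_1,\ldots,G_n) \tag{7.1} \\
  &= \sum_{G_1} \cdots \sum_{G_n} \prod_i \operatorname{Pen}(X_i \mid G_i) \prod_j \operatorname{Prior}(G_j) \prod_{\{k,l,m\}} \operatorname{Tran}(G_m \mid G_k, G_l), \notag
\end{align}
where the product on $j$ is taken over all founders and the product on
$\{k, l, m\}$ is taken over all parent–offspring triples.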
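To make the representation (7.1) concrete, the following sketch evaluates it as a brute-force joint sum over all genotype assignments. Everything specific in it is an assumption made for illustration: a single locus with alleles `A` and `a`, Hardy-Weinberg priors with a hypothetical allele frequency `p`, a fully penetrant dominant trait for the penetrance function, and the function names themselves.

```python
from itertools import product

GENOTYPES = ("AA", "Aa", "aa")  # unordered genotypes at one biallelic locus


def prior(g, p=0.1):
    """Founder prior under Hardy-Weinberg equilibrium (p = frequency of A)."""
    return {"AA": p * p, "Aa": 2 * p * (1 - p), "aa": (1 - p) ** 2}[g]


def gamete(g, allele):
    """Probability that a parent with genotype g transmits the given allele."""
    return g.count(allele) / 2.0


def tran(child, father, mother):
    """Mendelian transmission probability Tran(child | father, mother)."""
    a, b = child
    pr = gamete(father, a) * gamete(mother, b)
    if a != b:  # a heterozygous child can arise in two ways
        pr += gamete(father, b) * gamete(mother, a)
    return pr


def pen(phenotype, g):
    """Hypothetical penetrance: affected exactly when an A allele is present."""
    if phenotype is None:  # unobserved phenotype carries no information
        return 1.0
    return 1.0 if (phenotype == "affected") == ("A" in g) else 0.0


def likelihood(phenotypes, founders, triples):
    """Joint sum over all genotype assignments, term by term as in (7.1).

    phenotypes: dict person -> "affected", "unaffected", or None.
    founders: people without parents in the pedigree.
    triples: (father, mother, child) parent-offspring triples.
    """
    people = list(phenotypes)
    total = 0.0
    for assignment in product(GENOTYPES, repeat=len(people)):
        g = dict(zip(people, assignment))
        term = 1.0
        for i in people:
            term *= pen(phenotypes[i], g[i])
        for j in founders:
            term *= prior(g[j])
        for k, l, m in triples:
            term *= tran(g[m], g[k], g[l])
        total += term
    return total
```

The number of terms grows as $3^n$ in this sketch, which is why it is practical only for very small pedigrees.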
  At this point, several comments are appropriate concerning the explicit
likelihood representation (7.1). First, ranges of summation for the genotypes
are not specified. At the very least it is profitable to eliminate any
genotype $G_i$ with $\operatorname{Pen}(X_i \mid G_i) = 0$. We will discuss later
an algorithm for genotype elimination that performs much better than this naive
tactic in most circumstances. Second, the notation in (7.1) does not make it
clear whether the likelihood $L$ should be computed as a joint sum or as
an iterated sum. One can argue rigorously that an iterated sum is always
preferable to a joint sum if minimizing counts of additions and multiplications
is taken as a criterion [18]. Viewing (7.1) as an iterated sum opens
up the possibility of rearranging the order of summation so as to achieve
the most efficient computation. Third, calculation of $L$ is numerically stable
since only additions and multiplications of nonnegative numbers are
involved. There will be no disastrous roundoff errors due to subtraction
of quantities of similar magnitude. However, serious underflows can be
encountered because all terms are usually probabilities and hence lie in the
interval $[0, 1]$. Underflows can be successfully defused by repeated rescaling
and reporting the final answer as a loglikelihood. Last of all, the various
terms in (7.1) can be viewed as values taken on by arrays. For instance,
$\operatorname{Pen}(X_i \mid G_i)$ is an array of rank 1 that depends on the
possible genotypes $G_i$ for person $i$. Similarly,
$\operatorname{Tran}(G_k \mid G_i, G_j)$ is an array of rank 3 depending on $G_i$,
$G_j$, and $G_k$ jointly. Thus, computation of $L$ is inherently array-oriented.
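The second, third, and fourth comments can be illustrated together on a nuclear family. The sketch below is one possible arrangement, not a prescribed algorithm: it sums over each child's genotype inside the sum over the parental genotype pair (an iterated sum), rescales the intermediate rank-2 array after each child, and reports a loglikelihood. The function names, the single biallelic locus with genotypes indexed 0, 1, 2 for `AA`, `Aa`, `aa`, and the choice of rescaling by the largest entry are all assumptions made for illustration.

```python
import math


def mendelian_tran():
    """Rank-3 array tran[f][m][c] for genotypes AA, Aa, aa indexed 0, 1, 2."""
    t = [1.0, 0.5, 0.0]  # chance that each genotype transmits allele A
    tran = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    for f in range(3):
        for m in range(3):
            tran[f][m][0] = t[f] * t[m]
            tran[f][m][1] = t[f] * (1 - t[m]) + (1 - t[f]) * t[m]
            tran[f][m][2] = (1 - t[f]) * (1 - t[m])
    return tran


def nuclear_family_loglik(prior, pen_father, pen_mother, pen_children, tran):
    """Loglikelihood of a nuclear family via an iterated sum with rescaling.

    prior, pen_father, pen_mother: rank-1 arrays indexed by genotype.
    pen_children: list of rank-1 penetrance arrays, one per child.
    tran: rank-3 array indexed by father, mother, and child genotype.
    """
    n = len(prior)
    # Rank-2 array over parental genotype pairs: founder priors and penetrances.
    weight = [[prior[f] * pen_father[f] * prior[m] * pen_mother[m]
               for m in range(n)] for f in range(n)]
    log_scale = 0.0
    for pen_child in pen_children:
        # Inner sum over this child's genotype for every parental pair.
        for f in range(n):
            for m in range(n):
                weight[f][m] *= sum(pen_child[c] * tran[f][m][c]
                                    for c in range(n))
        # Rescale so the largest entry is 1, remembering the factor in logs.
        biggest = max(max(row) for row in weight)
        if biggest == 0.0:
            return float("-inf")  # data incompatible with every genotype pair
        weight = [[w / biggest for w in row] for row in weight]
        log_scale += math.log(biggest)
    total = sum(sum(row) for row in weight)
    return math.log(total) + log_scale if total > 0.0 else float("-inf")
```

With $c$ children the joint sum has $3^{c+2}$ terms, whereas this arrangement performs on the order of $27c$ multiplications. Because each child's factor is folded into `log_scale`, a family whose raw likelihood would underflow double precision still yields a finite loglikelihood.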