Page 137 - Applied Probability
P. 137

7. Computation of Mendelian Likelihoods
                                                                                            121
                              same number of operations when m 3 = 1 and strictly fewer operations
                              when m 3 > 1. If we take the alternative order (3, 1, 2), the total operations
                              are m 2 (m 1 + m 3 +1) − 1. Thus, the order (3, 1, 2) is even better than the
                              original order (3, 2, 1).
                                Note that when (7.2) is computed as an iterated sum, arrays are con-
                              stantly being created and discarded. At each summation, those arrays de-
                              pending on the current summation index are multiplied together, and the
                              resulting product array is summed on this index. This process leads to a
                              new array no longer depending on the eliminated index, and those arrays
                              participating in the formation of the new array can now be discarded. Each
                              summation therefore transforms the original computational problem into
                              a problem of the same sort, except that the number of indices is reduced
                              by one. Eventually, all indices are eliminated, and the original problem is
                              solved.
                                Finding an optimal or nearly optimal summation sequence is highly non-
                              trivial. In genetics problems, one can attempt to generate such sequences
                              by working from the periphery of the pedigree inward. Such pruning of the
                              pedigree succeeds for graphically simple pedigrees. However, in the pres-
                              ence of inbreeding, cycles or loops in the graphical structure of the pedigree
                              impede this approach. Furthermore, a purely graphical treatment ignores
                              the important differences in the number of genotypes per person. A detailed
                              analysis of this problem is carried out in [9].
                                Greedy algorithms provide useful heuristics for choosing a nearly opti-
                              mal summation sequence. For instance, we can always sum on that index
                              requiring the fewest current arithmetic operations to eliminate. Thus, in
                              our toy example, we would start with index 1 or 3 depending on whether
                              m 1 <m 3 or m 1 >m 3 . A tie m 1 = m 3 is broken arbitrarily. This greedy
                              heuristic is not always optimal, as Problem 3 indicates.
                                Another context where greedy algorithms arise naturally is in the for-
                              mation of array products. Consider, for instance, equation (7.3). If we first
                              multiply array B times array D to get

                                                   F(G 2 )  = B(G 2 )D(G 2 ),
                              and then multiply and sum to form


                                               E(G 1 )=        A(G 1 ,G 2 )F(G 2 ),
                                                          G 2 ∈S 2
                              we save m 1 m 2 − m 2 multiplications. This example illustrates that arrays
                              should always be multiplied pairwise until only two arrays involving the
                              current summation index survive. These last two arrays can then be mul-
                              tiplied and afterwards summed, or they can be simultaneously multiplied
                              and summed. The latter method generalizes matrix multiplication; it entails
                              the same amount of arithmetic but requires less storage than the former
                              method. In forming pairwise products of intermediate arrays, we face the
   132   133   134   135   136   137   138   139   140   141   142