Page 208 - Applied Probability

9. Descent Graph Methods
each putative position of the trait locus, the observed marker phenotypes
determine the conditional probabilities of the different descent graphs at
the trait locus. A given descent graph partitions the set of genes of affected
people at the trait locus into blocks. Two genes belong to the same block
if and only if they are identical by descent. Good nonparametric linkage
statistics quantify the clustering of genes in such partitions.
  In discussing possible statistics, it is useful to consider a generic partition
of genes into $m$ identity by descent blocks $B_1, \ldots, B_m$. If block $B_i$ contains
$|B_i|$ genes, then some appealing sharing statistics are:
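\begin{align*}
T_{\text{blocks}} &= m \\
T_{\max} &= \max_{1 \le i \le m} |B_i| \\
T_{\text{pairs}} &= \sum_{i=1}^{m} \binom{|B_i|}{2} \tag{9.15} \\
T_{\text{all}} &= \prod_{i=1}^{m} |B_i|!
\end{align*}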
Statistic $T_{\text{blocks}}$ counts the number of blocks, $T_{\max}$ records the maximum
number of genes within any block, and $T_{\text{pairs}}$ counts the number of pairs
of genes identical by descent over all blocks. Statistic $T_{\text{all}}$ is a rapidly
increasing function of the size of the blocks [43]. A low value of $T_{\text{blocks}}$ or a
high value of $T_{\max}$, $T_{\text{pairs}}$, or $T_{\text{all}}$ indicates clustering.
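As a small illustration of the four statistics, the following Python sketch computes them from a list of block sizes. The function name `sharing_statistics` and the two example partitions are assumptions made here for illustration; they do not come from the text.

```python
from math import comb, factorial, prod


def sharing_statistics(block_sizes):
    """Return T_blocks, T_max, T_pairs, and T_all for the given block sizes.

    block_sizes[i] is |B_i|, the number of genes in the i-th
    identity-by-descent block (hypothetical input format).
    """
    sizes = list(block_sizes)
    return {
        "blocks": len(sizes),                          # number of blocks m
        "max": max(sizes),                             # largest block
        "pairs": sum(comb(s, 2) for s in sizes),       # IBD pairs within blocks
        "all": prod(factorial(s) for s in sizes),      # product of |B_i|!
    }


# Two partitions of six genes into three blocks each.
clustered = sharing_statistics([4, 1, 1])
dispersed = sharing_statistics([2, 2, 2])
print(clustered)
print(dispersed)
```

Both partitions have the same number of blocks, so $T_{\text{blocks}}$ cannot separate them, while $T_{\max}$, $T_{\text{pairs}}$, and $T_{\text{all}}$ are all larger for the partition with one block of four genes.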
  Now suppose we have $r$ affecteds in a pedigree. If we suspect dominant
disease inheritance, then in most cases there is only one disease gene per
affected. This suggests that we entertain the thought experiment of sampling
one trait gene from each affected before making any comparison. Let
$i_k$ be an indicator that is 0 when we sample a maternal gene of the $k$th
affected person and 1 when we sample a paternal gene. Given a descent
graph, the statistics $T_{\text{blocks}}$ through $T_{\text{all}}$ are all meaningful for the genes
indicated by the vector $(i_1, \ldots, i_r)$. Furthermore, the statistic
\[
T_j^{\text{dom}} = \max_{(i_1, \ldots, i_r)} T_j[(i_1, \ldots, i_r)]
\]
is apt to be more informative of the sharing caused by dominant inheritance
than the statistic $T_j$. For a recessive disease, there are two disease genes per
affected, and sampling seems counterproductive. A compromise between
these two extremes is to employ the averaged statistic
\[
T_j^{\text{add}} = \frac{1}{2^r} \sum_{i_1=0}^{1} \cdots \sum_{i_r=0}^{1} T_j[(i_1, \ldots, i_r)]
\]
                              designed for diseases with additive penetrances.
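The sampling construction can be sketched in Python by enumerating all $2^r$ indicator vectors. The encoding is an assumption made for illustration: each affected is a pair `(maternal_label, paternal_label)`, where equal labels mean the genes are identical by descent under the given descent graph. The helper names `t_pairs`, `t_max`, `sampled_values`, `t_dom`, and `t_add` are likewise hypothetical. The sketch uses only statistics for which high values indicate clustering.

```python
from collections import Counter
from itertools import product
from math import comb


def t_pairs(labels):
    """Number of IBD pairs among the sampled genes."""
    return sum(comb(n, 2) for n in Counter(labels).values())


def t_max(labels):
    """Size of the largest IBD block among the sampled genes."""
    return max(Counter(labels).values())


def sampled_values(genes, stat):
    """Evaluate stat on every sampling vector (i_1, ..., i_r).

    genes[k] = (maternal_label, paternal_label) for the k-th affected,
    so i_k = 0 picks the maternal gene and i_k = 1 the paternal gene.
    """
    r = len(genes)
    return [
        stat([genes[k][i[k]] for k in range(r)])
        for i in product((0, 1), repeat=r)
    ]


def t_dom(genes, stat):
    """Maximum of the statistic over all 2^r sampling vectors."""
    return max(sampled_values(genes, stat))


def t_add(genes, stat):
    """Average of the statistic over all 2^r sampling vectors."""
    values = sampled_values(genes, stat)
    return sum(values) / len(values)


# Three affecteds who all carry a copy of founder gene "a".
genes = [("a", "b"), ("a", "c"), ("d", "a")]
print(t_dom(genes, t_pairs), t_add(genes, t_pairs))
print(t_dom(genes, t_max), t_add(genes, t_max))
```

In this example the vector $(0, 0, 1)$ selects the shared gene from every affected, so the dominant version of $T_{\text{pairs}}$ attains its largest possible value of 3 for three sampled genes, while the additive version averages over the eight vectors. Exhaustive enumeration costs $2^r$ evaluations and is only practical for a small number of affecteds.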
  In practice, one takes the expected values of these nonparametric
statistics conditional on the observed marker genotypes, the trait location,