Page 84 - Applied Probability

4. Hypothesis Testing and Categorical Data
                              function that mutations in these amino acids are immediately eliminated
                              by evolution.
                              4.6 Exact Tests of Independence
                              The problem of testing linkage equilibrium is equivalent to a more
                              general statistical problem of testing for independence in contingency tables.
                              To translate into the usual statistical terminology, one need only equate
                              “locus” to “factor,” “allele” to “level,” and “linkage equilibrium” to
                              “independence.” In exact inference, one conditions on the marginal counts of a
                              contingency table. In the linkage equilibrium setting, this means
                              conditioning on the allele counts at each locus. Suppose we sample n independent
                              haplotypes defined on m loci. Recall that a haplotype $i = (i_1, \ldots, i_m)$ is
                              just an m-tuple of allele choices at the participating loci. If the frequency
                              of allele k at locus j is $p_{jk}$, then under linkage equilibrium the haplotype
                              $i = (i_1, \ldots, i_m)$ has probability

                              $$p_i = \prod_{j=1}^{m} p_{j i_j},$$

                              and the haplotype counts $\{n_i\}$ from the sample follow a multinomial
                              distribution with parameters $(n, \{p_i\})$. The marginal allele counts $\{n_{jk}\}$ at
                              any locus j likewise follow a multinomial distribution with parameters
                              $(n, \{p_{jk}\})$. Since under the null hypothesis of linkage equilibrium, marginal
                              counts are independent from locus to locus, the conditional distribution of
                              the haplotype counts is

                              $$\Pr(\{n_i\} \mid \{n_{jk}\}) = \frac{\binom{n}{\{n_i\}} \prod_i p_i^{n_i}}{\prod_{j=1}^{m} \binom{n}{\{n_{jk}\}} \prod_k (p_{jk})^{n_{jk}}} = \frac{\binom{n}{\{n_i\}}}{\prod_{j=1}^{m} \binom{n}{\{n_{jk}\}}}. \qquad (4.4)$$

                              One of the pleasant facts of exact inference is that the multivariate
                              Fisher-Yates distribution (4.4) does not depend on the unknown allele
                              frequencies. Problem 8 indicates how to compute its moments [23].
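
The Fisher-Yates probability (4.4) can be evaluated directly from a table of haplotype counts. The following is a minimal sketch, not a method taken from the text: the function names `multinomial` and `fisher_yates_probability`, and the choice to store counts in a dictionary keyed by allele tuples, are illustrative assumptions. Exact rational arithmetic is used so that no rounding enters the result, and no allele frequencies appear anywhere in the computation.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def multinomial(n, counts):
    """Multinomial coefficient n! / (c_1! c_2! ...), as an exact integer."""
    if sum(counts) != n:
        raise ValueError("counts must sum to n")
    result = factorial(n)
    for c in counts:
        result //= factorial(c)  # each partial quotient is an integer
    return result


def fisher_yates_probability(haplotype_counts):
    """Evaluate (4.4) for counts given as {allele tuple: n_i}.

    Hypothetical helper: each key is an m-tuple of alleles, one per locus.
    """
    n = sum(haplotype_counts.values())
    m = len(next(iter(haplotype_counts)))
    numerator = multinomial(n, list(haplotype_counts.values()))
    denominator = 1
    for j in range(m):
        # Marginal allele counts n_jk at locus j.
        marginal = Counter()
        for haplotype, count in haplotype_counts.items():
            marginal[haplotype[j]] += count
        denominator *= multinomial(n, list(marginal.values()))
    return Fraction(numerator, denominator)


# Illustrative data (made up): n = 6 haplotypes on m = 2 loci.
example = {("A", "B"): 2, ("A", "b"): 1, ("a", "B"): 1, ("a", "b"): 2}
print(fisher_yates_probability(example))  # 9/20
```

For two loci with two alleles each, this reduces to the familiar hypergeometric probability of a 2 × 2 table, and summing the function over all tables that share the same margins gives 1.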
                                We can also derive the Fisher-Yates distribution by a counting argument
                              involving a sample space distinct from the space of haplotype counts.
                              Consider an m × n matrix whose rows correspond to loci and whose columns
                              correspond to haplotypes. At locus j there are n genes with $n_{jk}$ genes
                              representing allele k. If we uniquely label each of these n genes, then there are
                              n! distinguishable permutations of the genes in row j. The uniform sample
                              space consists of the $(n!)^m$ matrices derived from the n! permutations of
                              each of the m rows. Each such matrix is assigned probability $1/(n!)^m$. For