Page 92 - Applied Probability

4. Hypothesis Testing and Categorical Data
                                   transmission of the disease and marker genes, Badner et al. [3] show
                                   that T has mean and variance
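                                   \[
                                   E(T) =
                                   \begin{cases}
                                   s \left(\frac{1}{2}\right)^{s} \binom{s}{s/2}, & s \text{ even}, \\
                                   s \left(\frac{1}{2}\right)^{s-1} \binom{s-1}{(s-1)/2}, & s \text{ odd},
                                   \end{cases}
                                   \qquad
                                   \operatorname{Var}(T) = s - E(T)^2 .
                                   \]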
                                   Prove these formulas. If there are $n$ such parents (usually two per
                                   family), and the $i$th parent has statistic $T_i$, then the overall statistic
                                   \[
                                   \frac{\sum_{i=1}^{n} [T_i - E(T_i)]}{\sqrt{\sum_{i=1}^{n} \operatorname{Var}(T_i)}}
                                   \]
                                   should be approximately standard normal. A one-sided test is
                                   appropriate because the $T_i$ tend to increase in the presence of linkage
                                   between the marker locus and a disease-predisposing locus. (Hint:
                                   The identities
                                   \[
                                   \sum_{i=0}^{s/2-1} \binom{s}{i} = 2^{s-1} - \frac{1}{2} \binom{s}{s/2},
                                   \qquad
                                   \sum_{i=0}^{s/2-1} i \binom{s}{i} = s \left[ 2^{s-2} - \binom{s-1}{s/2} \right]
                                   \]
                                   for s even and similar identities for s odd are helpful.)
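
A numerical check can complement the proof. The sketch below assumes, as in the usual statement of this problem, that $T = |n_a - n_b|$ with $n_a$ binomially distributed with $s$ trials and success probability $1/2$ under the null hypothesis; that definition comes from the earlier part of the problem and is not restated here. The function names are illustrative only.

```python
from fractions import Fraction
from math import comb, sqrt


def exact_moments(s):
    """Mean and variance of T = |2*n_a - s| by direct enumeration,
    ASSUMING n_a ~ Binomial(s, 1/2) under the null hypothesis."""
    probs = [Fraction(comb(s, k), 2**s) for k in range(s + 1)]
    mean = sum(p * abs(2 * k - s) for k, p in enumerate(probs))
    second = sum(p * (2 * k - s) ** 2 for k, p in enumerate(probs))
    return mean, second - mean**2


def formula_moments(s):
    """Mean and variance of T from the closed-form expressions."""
    if s % 2 == 0:
        mean = s * Fraction(1, 2**s) * comb(s, s // 2)
    else:
        mean = s * Fraction(1, 2 ** (s - 1)) * comb(s - 1, (s - 1) // 2)
    return mean, s - mean**2


def overall_statistic(t_values, sib_counts):
    """Standardized sum over parents; t_values[i] is the observed T_i and
    sib_counts[i] the number s of affected sibs for parent i.
    The total variance must be positive (s = 1 alone gives Var(T) = 0)."""
    num = sum(t - formula_moments(s)[0] for t, s in zip(t_values, sib_counts))
    den = sum(formula_moments(s)[1] for s in sib_counts)
    return float(num) / sqrt(float(den))


if __name__ == "__main__":
    for s in range(1, 9):
        print(s, exact_moments(s), formula_moments(s))
```

Exact rational arithmetic via `Fraction` avoids rounding, so agreement between the two functions is an equality rather than an approximate match. Such agreement for small $s$ is evidence, not a proof.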
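                                 8. To compute moments under the Fisher-Yates distribution (4.4), let
                                   \[
                                   u^{\underline{r}} =
                                   \begin{cases}
                                   u(u-1)\cdots(u-r+1), & r > 0, \\
                                   1, & r = 0,
                                   \end{cases}
                                   \]
                                   be a falling factorial power, and let $\{l_i\}$ be a collection of nonnegative
                                   integers indexed by the haplotypes $i = (i_1, \ldots, i_m)$. Setting $l = \sum_i l_i$
                                   and $l_{jk} = \sum_i 1_{\{i_j = k\}} l_i$, show that
                                   \[
                                   E\Bigl( \prod_i n_i^{\underline{l_i}} \Bigr)
                                   = \frac{\prod_{j=1}^{m} \prod_k (n_{jk})^{\underline{l_{jk}}}}{(n^{\underline{l}})^{m-1}} .
                                   \]
                                   In particular, verify that $E(n_i) = n \prod_{j=1}^{m} \frac{n_{j i_j}}{n}$.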
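
The factorial moment formula can be checked by brute force on a tiny data set. The sketch below assumes that the Fisher-Yates distribution (4.4) is the distribution of the haplotype counts $n_i$ when the alleles at each locus are permuted uniformly and independently across the $n$ sampled haplotypes; equation (4.4) itself is not reproduced here, so treat that reading as an assumption. All function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product


def falling(u, r):
    """Falling factorial u(u-1)...(u-r+1); equals 1 when r == 0."""
    out = 1
    for t in range(r):
        out *= u - t
    return out


def brute_force_moment(columns, powers):
    """E[prod_i falling(n_i, l_i)] by enumerating permutations.
    columns[j] lists the alleles at locus j; powers maps a haplotype
    tuple i to l_i. ASSUMPTION: loci are permuted independently and
    uniformly. The first locus is held fixed, which leaves the
    distribution of the haplotype counts unchanged."""
    first, rest = columns[0], columns[1:]
    total, count = Fraction(0), 0
    for perms in product(*(permutations(col) for col in rest)):
        counts = Counter(zip(first, *perms))
        term = 1
        for hap, l_i in powers.items():
            term *= falling(counts[hap], l_i)
        total += term
        count += 1
    return total / count


def formula_moment(columns, powers):
    """Right-hand side of the formula in the problem; requires l <= n."""
    n, m = len(columns[0]), len(columns)
    l = sum(powers.values())
    numerator = 1
    for j, col in enumerate(columns):
        n_j = Counter(col)              # allele counts n_jk at locus j
        l_j = Counter()                 # exponents l_jk at locus j
        for hap, l_i in powers.items():
            l_j[hap[j]] += l_i
        for k, l_jk in l_j.items():
            numerator *= falling(n_j[k], l_jk)
    return Fraction(numerator, falling(n, l) ** (m - 1))


if __name__ == "__main__":
    cols = [(1, 1, 2, 2, 2), (1, 2, 2, 1, 1)]
    for pw in ({(1, 1): 1}, {(1, 1): 2}, {(1, 1): 1, (2, 2): 1}):
        print(pw, brute_force_moment(cols, pw), formula_moment(cols, pw))
```

Enumeration grows factorially in $n$, so this is only practical for a handful of haplotypes; it serves as a sanity check on the formula rather than a substitute for the derivation.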
                                 9. Verify the mean and variance expressions in equation (4.6) using
                                   Problem 8. Alternatively, write $c_{1j}$ as a sum of indicator random
                                   variables and calculate the mean and variance directly. Check that
                                   the two methods give the same answer. (Hints: In applying Problem
                                   8, $i$ has two components. Set all but one of the $l_i$ equal to 0. Set the
                                   remaining one equal to 1 or 2 to get either a first or second factorial
                                   moment. The kth indicator random variable indicates whether the
                                   kth person is a case and has genotype j.)