Page 92 - Applied Probability
4. Hypothesis Testing and Categorical Data
transmission of the disease and marker genes, Badner et al. [3] show
that T has mean and variance
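\[
E(T) =
\begin{cases}
s \left(\tfrac{1}{2}\right)^{s} \binom{s}{s/2}, & s \text{ even}, \\
s \left(\tfrac{1}{2}\right)^{s-1} \binom{s-1}{(s-1)/2}, & s \text{ odd},
\end{cases}
\]
\[
\operatorname{Var}(T) = s - E(T)^2 .
\]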
Prove these formulas. If there are $n$ such parents (usually two per
family), and the $i$th parent has statistic $T_i$, then the overall statistic
\[
\frac{\sum_{i=1}^n [T_i - E(T_i)]}{\sqrt{\sum_{i=1}^n \operatorname{Var}(T_i)}}
\]
should be approximately standard normal. A one-sided test is appropriate
because the $T_i$ tend to increase in the presence of linkage
between the marker locus and a disease predisposing locus. (Hint:
The identities
\[
\sum_{i=0}^{s/2-1} \binom{s}{i} = 2^{s-1} - \frac{1}{2}\binom{s}{s/2}
\]
\[
\sum_{i=0}^{s/2-1} i \binom{s}{i} = s\,2^{s-2} - s\binom{s-1}{s/2}
\]
for s even and similar identities for s odd are helpful.)
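A numerical check is no substitute for the requested proof, but it helps confirm that the formulas have been read correctly. The sketch below assumes that T = |2X - s|, where X is binomial with s trials and success probability 1/2 (the number of affected children receiving one particular marker allele from the parent). That definition is not restated in this excerpt; it is inferred from the identity Var(T) = s - E(T)^2. The function names are illustrative only.

```python
from fractions import Fraction
from math import comb, sqrt

def exact_moments(s):
    """Mean and variance of T = |2X - s|, X ~ Binomial(s, 1/2), by enumeration.

    ASSUMPTION: this definition of T is inferred, not quoted from the text.
    """
    probs = [Fraction(comb(s, x), 2**s) for x in range(s + 1)]
    mean = sum(p * abs(2 * x - s) for x, p in enumerate(probs))
    second = sum(p * (2 * x - s) ** 2 for x, p in enumerate(probs))
    return mean, second - mean**2

def formula_mean(s):
    """Closed form: s (1/2)^s C(s, s/2) for even s,
    s (1/2)^(s-1) C(s-1, (s-1)/2) for odd s."""
    if s % 2 == 0:
        return Fraction(s * comb(s, s // 2), 2**s)
    return Fraction(s * comb(s - 1, (s - 1) // 2), 2 ** (s - 1))

def formula_var(s):
    """Closed form: Var(T) = s - E(T)^2."""
    return s - formula_mean(s) ** 2

def overall_statistic(t_values, s_values):
    """Sum of centered T_i divided by the square root of the summed variances.

    Note: Var(T) = 0 when s = 1, so at least one parent needs s >= 2.
    """
    num = sum(t - formula_mean(s) for t, s in zip(t_values, s_values))
    den = sum(formula_var(s) for s in s_values)
    return float(num) / sqrt(float(den))

# Closed forms agree with brute-force enumeration for small s.
for s in range(1, 13):
    assert exact_moments(s) == (formula_mean(s), formula_var(s))

# The two hint identities for even s (first one multiplied through by 2).
for s in range(2, 13, 2):
    half = s // 2
    assert 2 * sum(comb(s, i) for i in range(half)) == 2**s - comb(s, half)
    assert sum(i * comb(s, i) for i in range(half)) == s * 2 ** (s - 2) - s * comb(s - 1, half)
```

Exact rational arithmetic via `Fraction` avoids any floating-point tolerance in the comparison; only the final statistic is converted to a float.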
8. To compute moments under the Fisher-Yates distribution (4.4), let
\[
u^{\underline{r}} =
\begin{cases}
u(u-1)\cdots(u-r+1), & r > 0, \\
1, & r = 0
\end{cases}
\]
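be a falling factorial power, and let $\{l_i\}$ be a collection of nonnegative
integers indexed by the haplotypes $i = (i_1, \ldots, i_m)$. Setting $l = \sum_i l_i$
and $l_{jk} = \sum_i 1_{\{i_j = k\}} l_i$, show that
\[
E\Big( \prod_i n_i^{\underline{l_i}} \Big)
= \frac{\prod_{j=1}^m \prod_k (n_{jk})^{\underline{l_{jk}}}}{(n^{\underline{l}})^{m-1}} .
\]
In particular, verify that
\[
E(n_i) = n \prod_{j=1}^m \frac{n_{j i_j}}{n} .
\]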
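The factorial moment formula of Problem 8 can be checked by brute force on a small example before attempting the proof. The sketch below assumes that the Fisher-Yates distribution (4.4), which is not reproduced in this excerpt, is the distribution of haplotype counts obtained by holding the alleles at the first locus fixed and independently and uniformly permuting the alleles at each remaining locus across the n sampled haplotypes. The function names and the example data are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product
from math import prod

def falling(u, r):
    """Falling factorial power u(u-1)...(u-r+1); equals 1 when r = 0."""
    return prod(u - t for t in range(r))

def exact_factorial_moment(alleles, l):
    """E[prod_i falling(n_i, l_i)] by enumerating permutations of loci 2..m.

    alleles[j] lists the allele at locus j for each of the n haplotypes.
    l maps a haplotype tuple i to its nonnegative integer l_i.
    ASSUMPTION: (4.4) is the permutation distribution described above.
    """
    first, rest = alleles[0], alleles[1:]
    total, count = Fraction(0), 0
    for perms in product(*(permutations(col) for col in rest)):
        table = Counter(zip(first, *perms))  # haplotype counts n_i
        total += prod(falling(table[i], r) for i, r in l.items())
        count += 1
    return total / count

def formula_factorial_moment(alleles, l):
    """Right-hand side of the formula; requires sum of l_i <= n."""
    n, m = len(alleles[0]), len(alleles)
    l_total = sum(l.values())
    num = 1
    for j, col in enumerate(alleles):
        counts = Counter(col)  # allele counts n_jk at locus j
        for k in set(counts) | {i[j] for i in l}:
            l_jk = sum(r for i, r in l.items() if i[j] == k)
            num *= falling(counts[k], l_jk)
    return Fraction(num, falling(n, l_total) ** (m - 1))

# Two loci, n = 5 haplotypes: allele counts (3, 2) and (2, 3).
two_loci = [[1, 1, 1, 2, 2], [1, 1, 2, 2, 2]]
for l in ({(1, 1): 1}, {(1, 1): 2, (1, 2): 1}):
    print(l, exact_factorial_moment(two_loci, l), formula_factorial_moment(two_loci, l))

# Three loci, n = 4 haplotypes.
three_loci = [[1, 1, 2, 2], [1, 2, 2, 2], [1, 1, 1, 2]]
l3 = {(1, 2, 1): 1, (2, 2, 1): 1}
print(l3, exact_factorial_moment(three_loci, l3), formula_factorial_moment(three_loci, l3))
```

The enumeration visits (n!)^(m-1) arrangements, so it is practical only for very small n and m. It is meant as a sanity check on the indices, not as a method of computation.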
9. Verify the mean and variance expressions in equation (4.6) using
Problem 8. Alternatively, write $c_{1j}$ as a sum of indicator random
variables and calculate the mean and variance directly. Check that
the two methods give the same answer. (Hints: In applying Problem
8, $i$ has two components. Set all but one of the $l_i$ equal to 0. Set the
remaining one equal to 1 or 2 to get either a first or second factorial
moment. The $k$th indicator random variable indicates whether the
$k$th person is a case and has genotype $j$.)
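As a guide to the first method, the following sketch shows how the two factorial moments from Problem 8 combine into a variance. Equation (4.6) is not reproduced in this excerpt, so the symbols here are assumptions introduced for illustration: $n$ is the total number of people, $r$ the number of cases, and $g_j$ the number of people with genotype $j$. The result should be compared against (4.6) rather than taken as its statement.

```latex
% Assumed notation (not from the text): n people, r cases, g_j people with genotype j.
% Take m = 2 in Problem 8, with haplotype i = (case, j), so that n_i = c_{1j}.
\begin{align*}
E(c_{1j}) &= \frac{r\, g_j}{n}
  && (l_i = 1), \\
E[c_{1j}(c_{1j}-1)] &= \frac{r(r-1)\, g_j (g_j - 1)}{n(n-1)}
  && (l_i = 2), \\
\operatorname{Var}(c_{1j})
  &= E[c_{1j}(c_{1j}-1)] + E(c_{1j}) - E(c_{1j})^2 \\
  &= \frac{r\, g_j\, (n-r)(n-g_j)}{n^2 (n-1)} .
\end{align*}
```

The last line is the familiar hypergeometric variance, which is what the indicator method should also produce once the covariances between distinct indicators are accounted for.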