Page 93 - Applied Probability

4. Hypothesis Testing and Categorical Data
10. A geneticist phenotypes $n$ unrelated people at each of $m$ loci with
codominant alleles and records a vector $i = (i_1/i_1^*, \ldots, i_m/i_m^*)$ of
genotypes for each person. Because phase is unknown, $i$ cannot be
resolved into two haplotypes. The data gathered can be summarized
by the number of people $n_i$ counted for each genotype vector $i$. Let $n_{jk}$
be the number of alleles of type $k$ at locus $j$ observed in the sample,
and let $n_h$ be the total number of heterozygotes observed over all
loci. Assuming genetic equilibrium, prove that the distribution of the
counts $\{n_i\}$ conditional on the allele totals $\{n_{jk}\}$ is

$$\Pr(\{n_i\} \mid \{n_{jk}\}) = \frac{\binom{n}{\{n_i\}} \, 2^{n_h}}{\prod_{j=1}^{m} \binom{2n}{\{n_{jk}\}}}. \tag{4.10}$$

The moments of the distribution (4.10) are computed in [24]; just as
with haplotype count data, all allele frequencies cancel.
11. Describe and program an efficient algorithm for generating random
permutations of the set $\{1, \ldots, n\}$. How many calls of a random
number generator are involved? How many interchanges of two numbers?
You might wish to compare your results to the algorithm in [29].
12. Describe and program a permutation version of the two-sample t-test.
Compare it on actual data to the standard two-sample t-test.
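
As a starting point for Problems 11 and 12, the following Python sketch pairs one standard shuffling scheme (the Fisher-Yates, or Knuth, shuffle, which may differ from the algorithm in [29]) with a Monte Carlo permutation test built on the pooled-variance t statistic. The function names, the two-sided form of the test, the default of 10,000 random permutations, and the fixed seed are illustrative assumptions rather than choices made in the text.

```python
import math
import random


def random_permutation(n, rng=random):
    """Return a uniformly random permutation of [1, ..., n] (Fisher-Yates).

    The loop runs n - 1 times; each pass makes one call to the random
    number generator and one interchange (possibly of an element with itself).
    """
    perm = list(range(1, n + 1))
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)  # uniform on {0, ..., i}, endpoints included
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def pooled_t(x, y):
    """Equal-variance two-sample t statistic.

    Assumes each sample has at least one value, nx + ny > 2, and that the
    pooled variance is nonzero.
    """
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    sp2 = ss / (nx + ny - 2)  # pooled variance estimate
    return (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))


def permutation_t_test(x, y, n_perm=10000, seed=0):
    """Estimate a two-sided permutation p-value for the t statistic.

    Group labels are reassigned by random permutations of the pooled data.
    The observed arrangement is counted once, so the estimate is never zero.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    nx = len(x)
    t_obs = abs(pooled_t(x, y))
    hits = 0
    for _ in range(n_perm):
        perm = random_permutation(len(pooled), rng)
        shuffled = [pooled[k - 1] for k in perm]  # perm is 1-based
        t_perm = abs(pooled_t(shuffled[:nx], shuffled[nx:]))
        if t_perm >= t_obs - 1e-12:  # small tolerance for rounding error
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

With two samples x and y given as lists of numbers, permutation_t_test(x, y) returns the estimated p-value, which can then be set beside the p-value from the standard two-sample t-test as the problem asks. When the samples are small, enumerating all splits of the pooled data instead of sampling permutations gives the exact permutation distribution.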
4.10 References
[1] Agresti A (1992) A survey of exact inference for contingency tables.
Stat Sci 7:131–177

[2] Allison DB, Heo M, Kaplan N, Martin ER (1999) Sibling-based tests
of linkage and association for quantitative traits. Amer J Hum Genet
64:1754–1763

[3] Badner JA, Chakravarti A, Wagener DK (1984) A test of nonrandom
segregation. Genet Epidemiol 1:329–340

[4] Barbour AD, Holst L, Janson S (1992) Poisson Approximation. Oxford
University Press, Oxford

[5] Boehnke M, Langefeld CD (1998) Genetic association mapping based
on discordant sib pairs: the discordant-alleles test. Amer J Hum Genet
62:950–961

[6] Cavalli-Sforza LL, Bodmer WF (1971) The Genetics of Human
Populations. Freeman, San Francisco
