Page 87 - Applied Probability
4. Hypothesis Testing and Categorical Data
The levels for disease status are case and control. The levels for genotype
are the various observed genotypes among either cases or controls. One
can use multilocus genotypes rather than single-locus genotypes and, when
available, haplotypes rather than genotypes. The use of haplotypes doubles
sample size and leads to smaller tables with less sparsity. Because cases
often differ from controls in the overabundance of one or two genotypes
(or haplotypes), it is desirable to implement a test that is sensitive to such
departures. A variation on the $Z_{\max}$ test is nearly ideal in this regard.
Consider the standardized residuals
$$
Z_{ij} = \frac{c_{ij} - E(c_{ij})}{\sqrt{\operatorname{Var}(c_{ij})}},
$$
where $c_{1j}$ is the number of times genotype $j$ appears among cases and $c_{2j}$
is the number of times genotype $j$ appears among controls. The statistic
$Z_{\max} = \max_{i,j} Z_{ij}$ simplifies to $Z_{\max} = \max_j |Z_{1j}|$ because $Z_{2j} = -Z_{1j}$.
Permutation of case-control labels offers the opportunity of approximating
the distribution of this statistic. Problems 8 and 9 give the mean and
variance of $c_{1j}$ as
$$
\begin{aligned}
E(c_{1j}) &= \frac{c_{1.}\, c_{.j}}{n} \\
\operatorname{Var}(c_{1j}) &= \frac{c_{1.}(c_{1.} - 1)\, c_{.j}(c_{.j} - 1)}{n(n - 1)} + E(c_{1j}) - E(c_{1j})^2,
\end{aligned}
\tag{4.6}
$$
where $c_{1.}$ is the number of cases, $c_{.j}$ is the number of times genotype $j$
appears among both cases and controls, and $n$ is the number of cases plus
the number of controls. The marginal sums $c_{1.}$ and $c_{.j}$ are the analogs of
the marginal allele counts $n_{jk}$ in the linkage equilibrium problem.
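As a minimal sketch of the permutation procedure just described, the following Python code computes the standardized residuals from the moments in (4.6) and approximates the p-value of $Z_{\max}$ by shuffling case-control labels. It assumes NumPy is available. The function names (`null_moments`, `standardized_residuals`, `z_max`, `permutation_p_value`) and the toy data are hypothetical illustrations; the toy data are not the ABO ulcer data of Table 4.1. The sketch also assumes that both cases and controls are present and that at least two genotypes are observed, so that every variance is positive.

```python
import numpy as np


def null_moments(row_total, col_totals, n):
    """Mean and variance of a cell count under random labels, as in (4.6)."""
    col_totals = np.asarray(col_totals, dtype=float)
    mean = row_total * col_totals / n
    var = (row_total * (row_total - 1) * col_totals * (col_totals - 1)
           / (n * (n - 1)) + mean - mean**2)
    return mean, var


def standardized_residuals(is_case, genotype):
    """Return the 2 x m array of Z_ij (row 0 = cases, row 1 = controls)."""
    is_case = np.asarray(is_case, dtype=bool).ravel()
    # Recode genotypes as 0..m-1 so that only observed genotypes get a column.
    _, codes = np.unique(np.asarray(genotype).ravel(), return_inverse=True)
    codes = codes.ravel()
    m = codes.max() + 1
    n = codes.size
    col_totals = np.bincount(codes, minlength=m)
    rows = []
    for mask in (is_case, ~is_case):
        counts = np.bincount(codes[mask], minlength=m)
        mean, var = null_moments(mask.sum(), col_totals, n)
        rows.append((counts - mean) / np.sqrt(var))
    return np.vstack(rows)


def z_max(is_case, genotype):
    """Z_max = max_j |Z_1j|; the case row suffices because Z_2j = -Z_1j."""
    return np.abs(standardized_residuals(is_case, genotype)[0]).max()


def permutation_p_value(is_case, genotype, n_perm=10_000, seed=0):
    """Approximate the p-value of Z_max by permuting case-control labels."""
    rng = np.random.default_rng(seed)
    is_case = np.asarray(is_case, dtype=bool).ravel()
    observed = z_max(is_case, genotype)
    hits = 0
    for _ in range(n_perm):
        # Shuffling labels keeps both sets of marginal sums fixed.
        if z_max(rng.permutation(is_case), genotype) >= observed:
            hits += 1
    p = hits / n_perm
    # Binomial standard error of the Monte Carlo estimate.
    return p, np.sqrt(p * (1.0 - p) / n_perm)


# Hypothetical toy data: 10 cases followed by 10 controls, two genotypes.
is_case = np.array([True] * 10 + [False] * 10)
genotype = np.array(["A"] * 8 + ["B"] * 2 + ["A"] * 2 + ["B"] * 8)
p, se = permutation_p_value(is_case, genotype, n_perm=2000)
print(f"Z_max = {z_max(is_case, genotype):.3f}, p ~ {p:.4f} (SE {se:.4f})")
```

Because permuting labels leaves $c_{1.}$ and $c_{.j}$ unchanged, the means and variances could be computed once outside the loop; the sketch recomputes them for clarity rather than speed.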
Example 4.7.1 Exact Treatment of the ABO Ulcer Data
The ABO ulcer data of Table 4.1 provide a chance to compare the various
test statistics. The permutation version of Fisher’s exact test and the $Z_{\max}$
test give p-values of 0.0335 ± 0.0036 and 0.0169 ± 0.0026, respectively, for
10,000 permutations. As anticipated, the $Z_{\max}$ statistic attains its maximum
for genotype O. These results compare well with the p-value of 0.0295
for the likelihood ratio test and suggest that the $Z_{\max}$ statistic possesses
somewhat greater power than the other two statistics for detecting departures
in a single genotype.
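The ± figures quoted in this example can be checked with a short calculation. The sketch below assumes that each permutation p-value is a binomial proportion over 10,000 independent permutations and that the quoted error is two standard errors. That convention is an inference from the arithmetic, not something stated in this passage, and the function name `two_standard_errors` is hypothetical.

```python
import math


def two_standard_errors(p_hat, n_perm):
    """Twice the binomial standard error of a Monte Carlo p-value estimate."""
    return 2.0 * math.sqrt(p_hat * (1.0 - p_hat) / n_perm)


# The two permutation p-values reported in Example 4.7.1.
for p_hat in (0.0335, 0.0169):
    half_width = two_standard_errors(p_hat, 10_000)
    print(f"p = {p_hat:.4f}, two standard errors = {half_width:.4f}")
```

Under these assumptions the half-widths round to 0.0036 and 0.0026, in agreement with the values reported above.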
4.8 The Transmission/Disequilibrium Test
Example 4.2.1 on the association between the ABO system and duodenal
ulcer depended on detecting a difference in allele frequencies between
patients and normal controls. In a racially homogeneous population like that