Page 184 - Applied Probability
P. 184
9
Descent Graph Methods
9.1 Introduction
Mapping disease and marker loci from pedigree phenotypes is one of the
most computationally onerous tasks in modern biology. Even tightly opti-
mized software can be quickly overwhelmed by the synergistic obstructions
of missing data, multiple marker loci, multiple alleles per marker locus,
and inbreeding. This unhappy situation has prompted mathematical and
statistical geneticists to formulate alternatives to the Elston-Stewart algo-
rithm. The most productive alternatives use elementary graph theory. The
Lander-Green-Kruglyak algorithm alluded to in Chapter 7 exploits gene
flow graphs and works well on small pedigrees. For large pedigrees, it is
helpful to combine the graph theory perspective with stochastic methods
of numerical integration [12, 23, 24, 32, 39, 40, 42].
One of the advantages of descent graph methods is that they enable us
to ask questions about genetic identity by descent. This has implications
for computing nonparametric linkage statistics and conditional kinship co-
efficients. Although the deterministic and stochastic algorithms share a
common core, they diverge in several details. The Lander-Green-Kruglyak
algorithm makes an interesting detour into Fourier analysis. The stochastic
methods can be viewed as part of the Markov chain Monte Carlo (MCMC)
revolution sweeping statistics. The Metropolis algorithm and Gibbs sam-
pling make it straightforward to construct a Markov chain sampling from
a complicated conditional distribution [7, 8, 10, 13, 28, 38]. Once a sam-
ple is available, then any conditional expectation can be approximated by
forming its corresponding sample average. The implications of this insight
are profound for both classical and Bayesian statistics. As a bonus, trivial
changes to the Metropolis algorithm yield simulated annealing, a general-
purpose algorithm for solving difficult combinatorial optimization problems
such as haplotyping [17, 31].
The agenda for this rather long chapter is to (a) review the existing
theory of finite-state Markov chains, (b) briefly explain the Metropolis al-
gorithm and simulated annealing, (c) apply these ideas to the computation
of location scores and the reconstruction of haplotypes, (d) develop the
Walsh transform and Baum’s algorithm standing behind the Lander-Green-
Kruglyak algorithm, and (e) show how these techniques fit in computing
nonparametric linkage statistics, conditional kinship coefficients, and error
probabilities in genotyping.