Page 217 - Applied Probability
10 Molecular Phylogeny
10.1 Introduction
Inferring the evolutionary relationships among related taxa (species, gen-
era, families, or higher groupings) is one of the most fascinating problems of
molecular genetics [17, 22, 23]. It is now relatively simple to sequence genes
and to compare the results from several contemporary taxa. In the current
chapter we will assume that the chore of aligning the DNA sequences from
these taxa has been successfully accomplished. The taxa are then arranged
in an evolutionary tree (or phylogeny) depicting how taxa diverge from
common ancestors. A single ancestral taxon roots the binary tree describ-
ing the evolution of the contemporary taxa. The reconstruction problem
can be briefly stated as finding the rooted evolutionary tree best fitting the
current DNA data. Once the best tree is identified, it is also of interest to
estimate the branch lengths of the tree. These tell us something about the
pace of evolution. For the sake of brevity, we will focus on the problem of
finding the best tree.
It is worth emphasizing that molecular phylogeny is an area of intense
current research. Most of the models applied are caricatures of reality. Be-
sides the dubious assumption that alignment is perfect, the models fail to
handle site-to-site variation in the rate of evolution, correlation in the
evolution of neighboring sites, and sequence variation within a taxon.
Evolutionary biologists tend to take the attitude that it is necessary to start somewhere
and that a failure to account for details will not distort overall patterns if
the patterns are sufficiently obvious. Mathematical biology abounds with
compromises of this sort. However, better models can answer more subtle
questions. Scientific attention is now shifting to identifying gene families,
sequence motifs, and conserved regions within genes. The final sections of
this chapter deal with codon models and spatial correlation in the rate of
evolution. These modeling elaborations have the potential of shedding light
on protein structure and function.
10.2 Evolutionary Trees
An evolutionary tree is a directed graph showing the relationships be-
tween a group of contemporary taxa and their hypothetical common ances-
tors. The root of the tree is the common ancestor of all of the contempo-
rary taxa. The other nodes are either the contemporary taxa at the tips of