Page 217 - Applied Probability

10
                              Molecular Phylogeny
                              10.1 Introduction

                              Inferring the evolutionary relationships among related taxa (species, gen-
                              era, families, or higher groupings) is one of the most fascinating problems of
                              molecular genetics [17, 22, 23]. It is now relatively simple to sequence genes
                              and to compare the results from several contemporary taxa. In the current
                              chapter we will assume that the chore of aligning the DNA sequences from
                              these taxa has been successfully accomplished. The taxa are then arranged
                              in an evolutionary tree (or phylogeny) depicting how taxa diverge from
                              common ancestors. A single ancestral taxon roots the binary tree describ-
                              ing the evolution of the contemporary taxa. The reconstruction problem
                              can be briefly stated as finding the rooted evolutionary tree best fitting the
                              current DNA data. Once the best tree is identified, it is also of interest to
                              estimate the branch lengths of the tree. These tell us something about the
                              pace of evolution. For the sake of brevity, we will focus on the problem of
                              finding the best tree.
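
To make the objects in this problem concrete, here is a minimal sketch of how a rooted binary tree with branch lengths might be stored. The class `Node`, the helper functions, the taxon names A, B, and C, and the branch lengths are all hypothetical and purely illustrative; they are not taken from the text or from any data set.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    """One node of a rooted binary tree (hypothetical representation)."""
    name: str
    branch_length: float = 0.0       # length of the branch from the parent to this node
    left: Optional["Node"] = None    # both children are None at a tip
    right: Optional["Node"] = None

    def is_tip(self) -> bool:
        return self.left is None and self.right is None


def tips(node: Node) -> List[str]:
    """Names of the contemporary taxa below `node`, read left to right."""
    if node.is_tip():
        return [node.name]
    return tips(node.left) + tips(node.right)


def root_to_tip_lengths(node: Node, above: float = 0.0) -> Dict[str, float]:
    """Total branch length from the root down to each tip."""
    total = above + node.branch_length
    if node.is_tip():
        return {node.name: total}
    lengths = dict(root_to_tip_lengths(node.left, total))
    lengths.update(root_to_tip_lengths(node.right, total))
    return lengths


# Illustrative tree: A and B share a recent ancestor; C diverges at the root.
example = Node(
    "root",
    left=Node("ancestor_AB", 0.5, left=Node("A", 1.0), right=Node("B", 1.0)),
    right=Node("C", 1.5),
)
```

Calling `tips(example)` lists the contemporary taxa, and `root_to_tip_lengths(example)` sums the branch lengths along each path from the root. Choosing the best tree would amount to comparing many such candidate structures against the aligned sequences, which this sketch does not attempt.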
                                It is worth emphasizing that molecular phylogeny is an area of intense
                              current research. Most of the models applied are caricatures of reality. Be-
                              sides the dubious assumption that alignment is perfect, the models fail to
                              handle site-to-site variation in the rate of evolution, correlation in the evolu-
                              tion of neighboring sites, and sequence variation within a taxon. Evolution-
                              ary biologists tend to take the attitude that it is necessary to start somewhere
                              and that a failure to account for details will not distort overall patterns if
                              the patterns are sufficiently obvious. Mathematical biology abounds with
                              compromises of this sort. However, better models can answer more subtle
                              questions. Scientific attention is now shifting to identifying gene families,
                              sequence motifs, and conserved regions within genes. The final sections of
                              this chapter deal with codon models and spatial correlation in the rate of
                              evolution. These modeling elaborations have the potential of shedding light
                              on protein structure and function.



                              10.2 Evolutionary Trees


                              An evolutionary tree is a directed graph showing the relationships be-
                              tween a group of contemporary taxa and their hypothetical common ances-
                              tors. The root of the tree is the common ancestor of all of the contempo-
                              rary taxa. The other nodes are either the contemporary taxa at the tips of