Page 240 - Applied Probability
10. Molecular Phylogeny
respectively, where the constants $c_1, \ldots, c_6$ are the same ones defined in
equations (10.12) through (10.15).
15. For the nucleotide substitution model of Section 10.5, verify in general
that the equilibrium distribution is
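\[
\begin{aligned}
\pi_A &= \frac{\epsilon(\delta + \kappa) + \delta(\gamma + \lambda)}{(\alpha + \epsilon + \gamma + \lambda)(\gamma + \delta + \kappa + \lambda)} \\
\pi_G &= \frac{\alpha(\delta + \kappa) + \kappa(\gamma + \lambda)}{(\alpha + \epsilon + \gamma + \lambda)(\gamma + \delta + \kappa + \lambda)} \\
\pi_C &= \frac{\gamma(\delta + \kappa) + \sigma(\gamma + \lambda)}{(\beta + \delta + \kappa + \sigma)(\gamma + \delta + \kappa + \lambda)} \\
\pi_T &= \frac{\lambda(\delta + \kappa) + \beta(\gamma + \lambda)}{(\beta + \delta + \kappa + \sigma)(\gamma + \delta + \kappa + \lambda)}.
\end{aligned}
\]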
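Before attempting the general verification, it can help to check the balance equations numerically. The sketch below is only a sanity check, not a proof. It assumes that the generator of Section 10.5 (not reproduced on this page) has rates A→G = α, G→A = ε, C→T = β, T→C = σ, purine→C = γ, purine→T = λ, pyrimidine→A = δ, and pyrimidine→G = κ. The numerical rate values are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical rate values, chosen only for illustration.
alpha, beta, gamma, delta = map(Fraction, (1, 2, 3, 4))
epsilon, kappa, lam, sigma = map(Fraction, (5, 6, 7, 8))

# ASSUMPTION: off-diagonal transition rates of the Section 10.5 generator,
# keyed by (from_state, to_state). Adjust if the model in the text differs.
rates = {
    ("A", "G"): alpha,   ("A", "C"): gamma, ("A", "T"): lam,
    ("G", "A"): epsilon, ("G", "C"): gamma, ("G", "T"): lam,
    ("C", "A"): delta,   ("C", "G"): kappa, ("C", "T"): beta,
    ("T", "A"): delta,   ("T", "G"): kappa, ("T", "C"): sigma,
}
states = "AGCT"

# Candidate equilibrium distribution from the problem statement.
s = gamma + delta + kappa + lam
pur = alpha + epsilon + gamma + lam
pyr = beta + delta + kappa + sigma
pi = {
    "A": (epsilon * (delta + kappa) + delta * (gamma + lam)) / (pur * s),
    "G": (alpha * (delta + kappa) + kappa * (gamma + lam)) / (pur * s),
    "C": (gamma * (delta + kappa) + sigma * (gamma + lam)) / (pyr * s),
    "T": (lam * (delta + kappa) + beta * (gamma + lam)) / (pyr * s),
}

def balance_residual(j):
    """Probability flow into state j minus flow out of j; zero at equilibrium."""
    inflow = sum(pi[i] * rates[i, j] for i in states if i != j)
    outflow = pi[j] * sum(rates[j, k] for k in states if k != j)
    return inflow - outflow

residuals = [balance_residual(j) for j in states]
total = sum(pi.values())
print(residuals, total)
```

Exact rational arithmetic is used so that a nonzero residual signals a genuine mismatch rather than rounding error.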
16. There is an explicit formula for the equilibrium distribution of a
continuous-time Markov chain in terms of weighted in-trees [20]. To
describe this formula, we first define a directed graph on the states
$1, \ldots, n$ of the chain. The vertices of the graph are the states of the
chain, and the arcs of the graph are ordered pairs of states $(i, j)$ having
transition rates $\lambda_{ij} > 0$. If it is possible to reach some designated
state $k$ from every other state $i$, then a unique equilibrium distribution
$\pi = (\pi_1, \ldots, \pi_n)$ exists for the chain. Note that this reachability
condition is weaker than requiring that all states communicate.
The equilibrium distribution is characterized by defining certain sub-
graphs called in-trees. An in-tree $T_i$ to state $i$ is a subgraph having
$n - 1$ arcs and connecting each vertex $j \neq i$ to $i$ by some directed
path. Ignoring orientations, an in-tree is graphically a tree; observing
orientations, all paths lead to $i$. The weight $w(T_i)$ associated with
the in-tree $T_i$ is the product of the transition rates $\lambda_{jk}$ labeling the
various arcs $(j, k)$ of the in-tree. For instance, in the nucleotide substitution
chain, one in-tree to $A$ has arcs $(G, A)$, $(C, A)$, and $(T, C)$.
Its associated weight is $\epsilon\delta\sigma$.
In general, the equilibrium distribution is given by
\[
\pi_i = \frac{\sum_{T_i} w(T_i)}{\sum_j \sum_{T_j} w(T_j)}. \tag{10.17}
\]
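The in-tree formula can be explored by brute force on a small chain. The following sketch enumerates in-trees by letting every non-root vertex choose one outgoing arc and keeping only the choices in which all paths lead to the root. The helper names (`in_trees`, `reaches`, `in_tree_distribution`) and the rate matrix are hypothetical, introduced only for this illustration; arcs with rate zero simply contribute weight zero.

```python
from fractions import Fraction
from itertools import product

def reaches(j, root, succ, n):
    """Follow the single outgoing arc from each vertex; True if root is hit."""
    for _ in range(n):
        if j == root:
            return True
        j = succ[j]
    return False

def in_trees(n, root):
    """Yield each in-tree to `root` as a dict {vertex: head of its outgoing arc}.

    An in-tree has n - 1 arcs and a directed path from every non-root vertex
    to the root, so each non-root vertex has exactly one outgoing arc.
    """
    others = [v for v in range(n) if v != root]
    choices = [[k for k in range(n) if k != j] for j in others]
    for heads in product(*choices):
        succ = dict(zip(others, heads))
        if all(reaches(j, root, succ, n) for j in others):
            yield succ

def in_tree_distribution(rate):
    """Evaluate the in-tree formula for a matrix of off-diagonal rates."""
    n = len(rate)
    totals = []
    for i in range(n):
        total = Fraction(0)
        for succ in in_trees(n, i):
            w = Fraction(1)
            for j, k in succ.items():
                w *= rate[j][k]  # weight = product of rates on the arcs
            total += w
        totals.append(total)
    denom = sum(totals)
    return [t / denom for t in totals]

# Hypothetical 4-state chain; rate[j][k] is the rate from j to k (diagonal unused).
rate = [[0, 2, 1, 3],
        [1, 0, 4, 2],
        [3, 1, 0, 5],
        [2, 2, 1, 0]]

pi = in_tree_distribution(rate)
tree_count = sum(1 for _ in in_trees(4, 0))

# Global balance: flow into each state equals flow out of it.
n = len(rate)
residuals = [
    sum(pi[i] * rate[i][j] for i in range(n) if i != j)
    - pi[j] * sum(rate[j][k] for k in range(n) if k != j)
    for j in range(n)
]
print(tree_count, pi, residuals)
```

The enumeration examines $(n-1)^{n-1}$ arc choices per root, so it is practical only for very small $n$.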
The reachability condition implies that in-trees to state k exist and
consequently that the denominator in (10.17) is positive. The value
of the in-tree formula (10.17) is limited by the fact that in a Markov
chain with $n$ states there can be as many as $n^{n-2}$ in-trees to a given
state. Thus, in the nucleotide substitution model, there are $4^{4-2} = 16$
in-trees to each state and 64 in-trees in all. If you are undeterred by