Page 234 - Applied Probability

P. 234

10. Molecular Phylogeny
220
when codon site i is fast evolving. If the sites are numbered 1 through m,
then a Gibbs random ﬁeld assigns the prior probability
e
Pr(C = c)=
H(d)

e
d H(c)
to the random vector C =(C 1 ,...,C m ) using a potential function H(c).
One fruitful choice of H(c)is
m r m−j

c i + 1 {c i =c i+j } .
H(c)= θ 0 θ j
i=1 j=1 i=1
This multilevel logistic model [3] accounts for the proportion of slow evolv-
ing codon sites through the parameter θ 0 . Spatial correlation at a distance
j ≥ 1 is accounted for by the parameter θ j . The larger θ j is, the stronger
the correlation between codon sites i and i + j. Common sense suggests
that in most circumstances θ 1 ≥ θ 2 ≥· · · ≥ θ r ≥ 0. The special case r =1
is just the Ising model of statistical mechanics [21].
To compute the likelihood of the data under the rate variation model, we
m
follow Schadt and Lange [19], and let L c = denote the likelihood

i=1 L c i
of the data given the rate variation vector C = c =(c 1 ,...,c m ). The overall
likelihood is then
1 H(c)
L = L ce
e H(d)
d c
1 1 m r m−j
1
= ··· L c i e θ 0c i e θ j 1 {c i =c i+j } .
e H(d)
d c 1 =0 c m =0 i=1 j=1 i=1
Computation of either the partition function d e H(d) or the numerator
L ce of L reduces to the evaluation of a multiple sum of products of
H(c)
c
arrays. Once again we compute the multiple sum as an iterated sum. Con-
sider the numerator of L. In the forward algorithm, we eliminate the indices
c 1 , ··· ,c m in the indicated order. Starting with the array a 0 (c 1 ,...,c r )= 1
at step 0, at step i we create the new array a i (c i+1 ,... ,c min{i+r,m} )via
a i (c i+1 ,...,c min{i+r,m} )
1 min{r,m−i}

= L c i e θ 0 c i e θ j 1 {c i =c i+j } a i−1 (c i ,...,c min{i−1+r,m} )
c i =0 j=1
and discard the old array a i−1 (c i ,...,c min{i−1+r,m} ). The ﬁnal array a m
H(c)
has zero dimension and furnishes the numerator L ce .
c
In computing a i (c i+1 ,...,c min{i+r,m} ) it is advantageous to perform the
indicated array multiplications pairwise, starting with the two smallest ar-
e θ 0 c i , multiplying the resulting product against e θ 1 1 {c i =c i+1 } , and
rays L c i

229 230 231 232 233 234 235 236 237 238 239