Page 249 - Applied Probability
11. Radiation Hybrid Mapping
probabilities under this order must be substituted in the above calculations.
The next section addresses maximum likelihood estimation.
11.4 Maximum Likelihood Methods
The disadvantage of the minimum obligate breaks criterion is that it provides
neither estimates of physical distances between loci nor comparisons of
likelihoods for competing orders. Maximum likelihood remedies both defects,
but does so at the expense of introducing some of the explicit assumptions
mentioned earlier. We will now briefly discuss
how likelihoods are computed and maximized for a given order. Different
orders can be compared on the basis of their maximum likelihoods.
Because different clones are independent, it suffices to demonstrate how
to compute the likelihood of a single clone. Let $X = (X_1, \ldots, X_m)$ be the
observation vector for a clone potentially typed at $m$ loci. The component
$X_i$ is defined as 0, 1, or ?, depending on what is observed at the $i$th locus.
We can gain a feel for how to compute the likelihood of X by considering
two simple cases. If $m = 1$ and $X_1 \neq \text{?}$, then $X_1$ follows the Bernoulli
distribution
$$\Pr(X_1 = i) = r^i (1 - r)^{1-i} \tag{11.4}$$
for retention or nonretention. When m = 2 and both loci are typed, the
likelihood must reflect breakage as well as retention. If θ is the probability
of at least one break between the two loci, then
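$$
\begin{aligned}
\Pr(X_1 = 0, X_2 = 0) &= (1 - r)(1 - \theta r) \\
\Pr(X_1 = 1, X_2 = 0) &= \Pr(X_1 = 0, X_2 = 1) \\
&= (1 - r)\theta r \\
\Pr(X_1 = 1, X_2 = 1) &= 1 - 2(1 - r)\theta r - (1 - r)(1 - \theta r) \\
&= (1 - \theta + \theta r) r.
\end{aligned}
\tag{11.5}
$$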
Note that we parameterize in terms of the breakage probability θ between
the two loci rather than the physical distance δ between them. Besides the
obvious analytical simplification entailed in using θ, only the product λδ
can be estimated anyway. The parameters λ and δ cannot be separately
identified.
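
The two-locus probabilities in (11.5) are easy to check numerically. The following is a minimal sketch, not part of the original text: the function name `two_locus_probs` and the values r = 0.3 and θ = 0.4 are illustrative assumptions. It confirms that the four probabilities sum to 1, that the marginal retention probability at the first locus is r, and that the two discordant outcomes together have probability 2r(1 − r)θ.

```python
def two_locus_probs(r, theta):
    """Joint probabilities Pr(X1 = i, X2 = j) for one clone typed at two loci.

    r     : retention probability (same at both loci)
    theta : probability of at least one break between the loci
    Hypothetical helper; the formulas are those of (11.5).
    """
    p00 = (1 - r) * (1 - theta * r)
    p10 = (1 - r) * theta * r
    p01 = p10  # the two discordant outcomes are equally likely
    p11 = (1 - theta + theta * r) * r
    return {(0, 0): p00, (1, 0): p10, (0, 1): p01, (1, 1): p11}


# Illustrative parameter values (assumed, not taken from any data set).
r, theta = 0.3, 0.4
probs = two_locus_probs(r, theta)

total = sum(probs.values())                   # should equal 1
marginal_x1 = probs[(1, 0)] + probs[(1, 1)]   # should equal r
obligate = probs[(1, 0)] + probs[(0, 1)]      # should equal 2 r (1 - r) theta

print(total, marginal_x1, obligate)
```

Because only θ enters these formulas, the sketch takes θ directly as an argument rather than deriving it from λ and δ.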
As noted earlier, the probability of an obligate break between the two
loci is $2r(1 - r)\theta$, in agreement with the calculated value
$$\Pr(X_1 \neq X_2) = \Pr(X_1 = 1, X_2 = 0) + \Pr(X_1 = 0, X_2 = 1)$$
from (11.5). It is natural to estimate $r$ and $\Pr(X_1 \neq X_2)$ by their empirical
values. Given these estimates, one can then estimate $\theta$ via the identity
$$\theta = \frac{\Pr(X_1 \neq X_2)}{2r(1 - r)}.$$
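
The estimate of θ just described can be sketched in a few lines. This is one possible reading under stated assumptions, not the text's own procedure: every clone is typed at both loci (no ? entries), the retention probability is estimated by pooling both loci, and the function name `estimate_theta` and the ten toy clones are invented purely for illustration.

```python
def estimate_theta(clones):
    """Moment-type estimate of theta from two-locus clone data.

    clones : list of (x1, x2) pairs with entries 0 or 1, both loci typed.
    Returns (r_hat, d_hat, theta_hat), where d_hat is the empirical
    frequency of discordant clones, i.e. of Pr(X1 != X2).
    Hypothetical helper; pooling both loci to estimate r is an assumption.
    """
    n = len(clones)
    if n == 0:
        raise ValueError("no clones supplied")
    # Empirical retention frequency, pooled over the two loci.
    r_hat = sum(x1 + x2 for x1, x2 in clones) / (2 * n)
    if r_hat in (0.0, 1.0):
        raise ValueError("theta is not estimable when r_hat is 0 or 1")
    # Empirical frequency of an obligate break (discordant typing).
    d_hat = sum(1 for x1, x2 in clones if x1 != x2) / n
    # Identity theta = Pr(X1 != X2) / (2 r (1 - r)); in small samples the
    # ratio can exceed 1, in which case truncating at 1 is one option.
    theta_hat = d_hat / (2 * r_hat * (1 - r_hat))
    return r_hat, d_hat, theta_hat


# Toy data, made up for illustration only: 10 clones.
clones = [(0, 0)] * 6 + [(1, 1)] * 2 + [(1, 0)] + [(0, 1)]
r_hat, d_hat, theta_hat = estimate_theta(clones)
print(r_hat, d_hat, theta_hat)
```

For these toy clones the pooled retention frequency is 6/20 and the discordance frequency is 2/10, so the estimate is 0.2/(2 × 0.3 × 0.7).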