Page 123 - Applied Probability
$G_j$ match in state. Although substituting i.b.d. for identity by state might
be attractive in this definition, the alternative statistic with i.b.d. matches
counted would be considerably more difficult to evaluate. In any event, if
person $i$ has observed genotype $M_i = a_k/a_l$ and person $j$ has genotype
$M_j = a_m/a_n$, then definition (6.7) reduces to
\[
Z_{ij} = \frac{1}{4} 1_{\{a_k = a_m\}} f(p_k) + \frac{1}{4} 1_{\{a_k = a_n\}} f(p_k)
       + \frac{1}{4} 1_{\{a_l = a_m\}} f(p_l) + \frac{1}{4} 1_{\{a_l = a_n\}} f(p_l).
\]
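The reduced form of $Z_{ij}$ can be sketched in a few lines of Python. This is only an illustration: the function name `pair_statistic`, the allele labels, the frequencies, and the weighting choice $f(p) = 1/p$ are assumptions made for the example, not taken from the text.

```python
from fractions import Fraction


def pair_statistic(geno_i, geno_j, freq, f):
    """Z_ij for two observed unordered genotypes.

    geno_i, geno_j: 2-tuples of allele labels, e.g. ("a1", "a2").
    freq: dict mapping allele label -> population frequency p_k.
    f: weighting function applied to the frequency of a matching allele.
    """
    total = 0
    for allele_i in geno_i:            # a_k, then a_l
        for allele_j in geno_j:        # a_m, then a_n
            if allele_i == allele_j:   # match in state
                total += f(freq[allele_i])
    return total / 4                   # each of the four comparisons has weight 1/4


# Hypothetical allele frequencies and weighting function (assumptions).
freq = {"a1": Fraction(1, 2), "a2": Fraction(3, 10), "a3": Fraction(1, 5)}
inverse = lambda p: 1 / p

# Genotypes a1/a2 and a2/a3 share only allele a2 in state.
z = pair_statistic(("a1", "a2"), ("a2", "a3"), freq, inverse)
```

Only the comparison $a_l = a_m$ contributes here, so `z` equals $\frac{1}{4} f(p_2)$ with $p_2$ the frequency of `a2`.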
From the pairwise statistics $Z_{ij}$, we form an overall statistic
$Z = \sum_{\{i,j\}} Z_{ij}$
by summing over all affected pairs $\{i, j\}$ typed in the pedigree. In most
applications we take $i \neq j$, but the contrary procedure of comparing an
affected person to himself can be useful for inbred affecteds if the disease
is thought to be caused by recessively acting genes.
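As a rough sketch of how the overall statistic might be assembled, the Python fragment below sums the pairwise statistic over unordered pairs of typed affecteds, with an optional flag for the self-comparisons mentioned above. The names `z_pair`, `overall_statistic`, and `include_self`, the sample genotypes, and the constant weight $f(p) = 1$ are all assumptions for illustration.

```python
from itertools import combinations, combinations_with_replacement


def z_pair(geno_i, geno_j, freq, f):
    # Pairwise statistic: weight 1/4 for each of the four gene comparisons.
    return sum(f(freq[a]) for a in geno_i for b in geno_j if a == b) / 4


def overall_statistic(genotypes, freq, f, include_self=False):
    """Sum Z_ij over unordered pairs {i, j} of typed affecteds.

    genotypes: dict mapping person label -> 2-tuple genotype.
    include_self: if True, pairs with i = j are also counted.
    """
    people = sorted(genotypes)
    if include_self:
        pairs = combinations_with_replacement(people, 2)
    else:
        pairs = combinations(people, 2)
    return sum(z_pair(genotypes[i], genotypes[j], freq, f) for i, j in pairs)


# Hypothetical data: three typed affecteds and a constant weighting function.
freq = {"a1": 0.5, "a2": 0.3, "a3": 0.2}
genotypes = {"p1": ("a1", "a2"), "p2": ("a1", "a1"), "p3": ("a2", "a3")}
constant = lambda p: 1.0

z_distinct = overall_statistic(genotypes, freq, constant)
z_with_self = overall_statistic(genotypes, freq, constant, include_self=True)
```

With a constant weight, each $Z_{ij}$ is simply the fraction of the four gene comparisons that match in state, which makes the totals easy to check by hand.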
Since the mean and variance of $Z$ obviously are
\[
\begin{aligned}
E(Z) &= \sum_{\{i,j\}} E(Z_{ij}) \\
Var(Z) &= \sum_{\{i,j\}} \sum_{\{k,l\}} Cov(Z_{ij}, Z_{kl}),
\end{aligned}
\]
it suffices to calculate $E(Z_{ij})$ and $Cov(Z_{ij}, Z_{kl})$. If we condition on whether
the two sampled genes $G_i$ and $G_j$ are i.b.d., then it follows that
\[
\begin{aligned}
E(Z_{ij}) &= E[1_{\{G_i = G_j\}} f(p_{G_i})] \\
          &= \Phi_{ij} \sum_k f(p_k) p_k + (1 - \Phi_{ij}) \sum_k f(p_k) p_k^2 .
\end{aligned}
\]
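The formula for $E(Z_{ij})$ can be checked numerically in two simple cases. The sketch below assumes Hardy-Weinberg genotype frequencies and compares the closed form with brute-force enumeration for an unrelated pair ($\Phi_{ij} = 0$) and for a non-inbred person compared to himself ($\Phi_{ii} = 1/2$). All function names, the allele frequencies, and the weighting $f(p) = 1/p$ are assumptions chosen for the example.

```python
from fractions import Fraction
from itertools import product


def expected_pair_statistic(kinship, freq, f):
    # Closed form: Phi * sum_k f(p_k) p_k + (1 - Phi) * sum_k f(p_k) p_k^2.
    ibd_term = sum(f(p) * p for p in freq.values())
    ibs_term = sum(f(p) * p * p for p in freq.values())
    return kinship * ibd_term + (1 - kinship) * ibs_term


def z_pair(geno_i, geno_j, freq, f):
    # Pairwise statistic: weight 1/4 for each of the four gene comparisons.
    matches = sum(f(freq[a]) for a in geno_i for b in geno_j if a == b)
    return matches * Fraction(1, 4)


def brute_force_unrelated(freq, f):
    # Enumerate ordered genotypes of two unrelated people (assumes Hardy-Weinberg).
    total = 0
    for k, l, m, n in product(list(freq), repeat=4):
        prob = freq[k] * freq[l] * freq[m] * freq[n]
        total += prob * z_pair((k, l), (m, n), freq, f)
    return total


def brute_force_self(freq, f):
    # Enumerate ordered genotypes of one non-inbred person compared to himself.
    total = 0
    for k, l in product(list(freq), repeat=2):
        total += freq[k] * freq[l] * z_pair((k, l), (k, l), freq, f)
    return total


# Hypothetical allele frequencies and weighting function (assumptions).
freq = {"a1": Fraction(1, 2), "a2": Fraction(3, 10), "a3": Fraction(1, 5)}
inverse = lambda p: 1 / p

unrelated_ok = brute_force_unrelated(freq, inverse) == expected_pair_statistic(
    Fraction(0), freq, inverse)
self_ok = brute_force_self(freq, inverse) == expected_pair_statistic(
    Fraction(1, 2), freq, inverse)
```

Exact rational arithmetic via `Fraction` is used so that the two sides can be compared with `==` rather than with a floating-point tolerance.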
The covariance $Cov(Z_{ij}, Z_{kl}) = E(Z_{ij} Z_{kl}) - E(Z_{ij}) E(Z_{kl})$ can be
computed by first noting that $1_{\{G_i = G_j\}} f(p_{G_i})$ depends only on the observed
marker genotypes $M_i$ and $M_j$ and that $1_{\{G_k = G_l\}} f(p_{G_k})$ depends only on
the observed marker genotypes $M_k$ and $M_l$. These two facts imply that
\[
\begin{aligned}
E(Z_{ij} Z_{kl})
  &= E[E(1_{\{G_i = G_j\}} f(p_{G_i}) \mid M_i, M_j) \, E(1_{\{G_k = G_l\}} f(p_{G_k}) \mid M_k, M_l)] \\
  &= E[E(1_{\{G_i = G_j\}} f(p_{G_i}) 1_{\{G_k = G_l\}} f(p_{G_k}) \mid M_i, M_j, M_k, M_l)] \\
  &= E[1_{\{G_i = G_j\}} f(p_{G_i}) 1_{\{G_k = G_l\}} f(p_{G_k})].
\end{aligned}
\]
To evaluate the last expectation, we condition on how the four sampled
genes $G_i$, $G_j$, $G_k$, and $G_l$ are partitioned under identity by descent.
Consider again the condensed identity states of Figure 5.3. In each state,
imagine genes $G_i$ and $G_j$ appearing on the top row in no particular order
and genes $G_k$ and $G_l$ appearing on the bottom row in no particular order.