Page 60 - Elements of Distribution Theory
46 Conditional Distributions and Expectation
2.3 Conditional Distributions
Consider random variables X and Y. Suppose that Y is a discrete random variable taking
the values 0 and 1 with probabilities θ and 1 − θ, respectively, where 0 < θ < 1. From
elementary probability theory we know that the conditional probability that X ∈ A given
that Y = y is given by
\[
\Pr(X \in A \mid Y = y) = \frac{\Pr(X \in A, Y = y)}{\Pr(Y = y)}, \tag{2.1}
\]
provided that y = 0 or y = 1, so that Pr(Y = y) > 0. Hence, for any set A, the conditional
probability function Pr(X ∈ A|Y = y) satisfies the equation
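\[
\begin{aligned}
\Pr(X \in A) &= \Pr(X \in A, Y = 0) + \Pr(X \in A, Y = 1) \\
&= \Pr(X \in A \mid Y = 0)\Pr(Y = 0) + \Pr(X \in A \mid Y = 1)\Pr(Y = 1) \\
&= \int_{-\infty}^{\infty} \Pr(X \in A \mid Y = y)\, dF_Y(y).
\end{aligned}
\]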
Furthermore, for any subset B of {0, 1},
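\[
\begin{aligned}
\Pr(X \in A, Y \in B) &= \sum_{y \in B} \Pr(X \in A, Y = y) = \sum_{y \in B} \Pr(X \in A \mid Y = y)\Pr(Y = y) \\
&= \int_{B} \Pr(X \in A \mid Y = y)\, dF_Y(y). \tag{2.2}
\end{aligned}
\]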
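As a minimal numerical sketch of (2.1) and (2.2), the following Python code uses a hypothetical joint probability table for a discrete X and a binary Y. The table values (chosen so that θ = Pr(Y = 0) = 2/5) and the function names are illustrative assumptions, not taken from the text. The code computes Pr(X ∈ A|Y = y) from (2.1) and checks that summing Pr(X ∈ A|Y = y)Pr(Y = y) over y ∈ B reproduces Pr(X ∈ A, Y ∈ B) for every subset B of {0, 1}.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y): X in {1, 2, 3}, Y in {0, 1}.
# Illustrative values only; they give Pr(Y = 0) = theta = 2/5.
joint = {
    (1, 0): F(1, 10), (2, 0): F(1, 5),  (3, 0): F(1, 10),
    (1, 1): F(3, 10), (2, 1): F(1, 10), (3, 1): F(1, 5),
}

def pr_y(y):
    """Marginal probability Pr(Y = y)."""
    return sum(p for (_, yy), p in joint.items() if yy == y)

def pr_joint(A, B):
    """Pr(X in A, Y in B), read directly off the joint table."""
    return sum(p for (x, y), p in joint.items() if x in A and y in B)

def cond_prob(A, y):
    """Pr(X in A | Y = y) via (2.1); requires Pr(Y = y) > 0."""
    return pr_joint(A, {y}) / pr_y(y)

def joint_prob_via_conditional(A, B):
    """Right-hand side of (2.2): sum over y in B of Pr(X in A | Y = y) Pr(Y = y)."""
    return sum(cond_prob(A, y) * pr_y(y) for y in B)

A = {1, 2}
# (2.2) should hold for every subset B of {0, 1}, including the empty set.
for B in [set(), {0}, {1}, {0, 1}]:
    assert joint_prob_via_conditional(A, B) == pr_joint(A, B)
```

Exact rational arithmetic (`fractions.Fraction`) is used here so that the two sides of (2.2) can be compared with `==` rather than up to a floating-point tolerance.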
Now suppose that Y has an absolutely continuous distribution and consider Pr(X ∈
A|Y = y). If the distribution of Y is absolutely continuous, then Pr(Y = y) = 0 for all y so
that (2.1) cannot be used as a definition of Pr(X ∈ A|Y = y). Instead, we use a definition
based on a generalization of (2.2).
Let (X, Y) denote a random vector, where X and Y may each be vectors, and let 𝒳 × 𝒴
denote the range of (X, Y). In general, the conditional distribution of X given Y = y is a
function q(A, y), defined for subsets A ⊂ 𝒳 and elements y ∈ 𝒴, such that for B ⊂ 𝒴
\[
\Pr(X \in A, Y \in B) = \int_{B} q(A, y)\, dF_Y(y), \tag{2.3}
\]
where F_Y denotes the marginal distribution function of Y, and such that for each fixed
y ∈ 𝒴, q(·, y) defines a probability distribution on 𝒳. The quantity q(A, y) will be denoted
by Pr(X ∈ A|Y = y).
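To illustrate how (2.3) can be checked when Y is absolutely continuous, here is a rough Python sketch under an assumed joint density f(x, y) = x + y on the unit square. This density is not from the text and is chosen only because its integrals have simple closed forms. In this case dF_Y(y) = f_Y(y) dy with f_Y(y) = y + 1/2, and for A = [0, a] a candidate q(A, y) is the integral of f(x, y) over x ∈ A divided by f_Y(y). The function names and the midpoint-rule integrator are likewise illustrative assumptions.

```python
# Hypothetical joint density f(x, y) = x + y on [0, 1] x [0, 1] (an assumption
# made for illustration). Its Y-marginal density is f_Y(y) = y + 1/2.

def f_Y(y):
    """Marginal density of Y under the assumed joint density."""
    return y + 0.5

def q(a, y):
    """Candidate q(A, y) for A = [0, a]: (integral of x + y over [0, a]) / f_Y(y)."""
    return (a * a / 2 + a * y) / f_Y(y)

def rhs_2_3(a, b1, b2, n=10_000):
    """Midpoint-rule approximation of the integral over B = [b1, b2] of q(A, y) dF_Y(y)."""
    h = (b2 - b1) / n
    total = 0.0
    for k in range(n):
        y = b1 + (k + 0.5) * h
        total += q(a, y) * f_Y(y)
    return total * h

def lhs_2_3(a, b1, b2):
    """Pr(X <= a, Y in [b1, b2]), obtained by integrating x + y in closed form."""
    return a * a / 2 * (b2 - b1) + a * (b2 * b2 - b1 * b1) / 2

a, b1, b2 = 0.3, 0.2, 0.9
# Both sides of (2.3) agree up to floating-point error.
assert abs(rhs_2_3(a, b1, b2) - lhs_2_3(a, b1, b2)) < 1e-9
# For fixed y, q(., y) assigns probability 1 to the whole range [0, 1] of X.
assert abs(q(1.0, 0.4) - 1.0) < 1e-12
```

Here Pr(Y = y) = 0 for every y, so (2.1) is unavailable, yet the function q still satisfies the defining relation (2.3) for the sets A and B tried.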
Example 2.8 (Two-dimensional discrete random variable). Let (X, Y) denote a two-
dimensional discrete random variable with range
\[
\{1, 2, \ldots, m\} \times \{1, 2, \ldots, n\}.
\]
For each i = 1, 2, ..., m let
\[
q_i(y) = \Pr(X = i \mid Y = y).
\]
Then, according to (2.3), q_1(y), ..., q_m(y) must satisfy
\[
\Pr(X = i, Y = j) = q_i(j)\Pr(Y = j)
\]
for each i = 1,..., m and j = 1,..., n.
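The relation in Example 2.8 can be sketched in Python for a small hypothetical joint table with m = 2 and n = 3. The probabilities and the names `p`, `pr_y`, and `q` are illustrative assumptions. The code solves for q_i(j) by dividing each joint probability by the corresponding marginal Pr(Y = j), then checks the displayed identity and that q_1(j), ..., q_m(j) sum to 1 for each j.

```python
from fractions import Fraction as F

# Hypothetical joint pmf on {1, 2} x {1, 2, 3}: p[i][j] = Pr(X = i, Y = j).
p = {
    1: {1: F(1, 12), 2: F(1, 6), 3: F(1, 4)},
    2: {1: F(1, 4),  2: F(1, 6), 3: F(1, 12)},
}

def pr_y(j):
    """Marginal probability Pr(Y = j), summing the joint pmf over i."""
    return sum(p[i][j] for i in p)

def q(i, j):
    """q_i(j) = Pr(X = i | Y = j); requires Pr(Y = j) > 0."""
    return p[i][j] / pr_y(j)

for j in (1, 2, 3):
    # For each fixed j, q_1(j), ..., q_m(j) form a probability distribution.
    assert sum(q(i, j) for i in p) == 1
    for i in p:
        # The identity required by (2.3): Pr(X = i, Y = j) = q_i(j) Pr(Y = j).
        assert p[i][j] == q(i, j) * pr_y(j)
```

In this table every Pr(Y = j) is positive, so q_i(j) is determined uniquely by the identity.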