2.3 Conditional Distributions

Hence, if Pr(Y = j) > 0, then

    Pr(X = i|Y = j) = Pr(X = i, Y = j) / Pr(Y = j);

if Pr(Y = j) = 0, Pr(X = i|Y = j) may be taken to have any finite value.
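As a concrete illustration of this formula (not part of the text), the following Python sketch computes the conditional probabilities for X given Y = j from a small joint probability function; the names joint_pmf and conditional_pmf, and the particular probabilities used, are illustrative assumptions.

    # Joint probability function of (X, Y), stored as {(i, j): Pr(X = i, Y = j)}.
    joint_pmf = {
        (0, 0): 0.10, (0, 1): 0.20,
        (1, 0): 0.30, (1, 1): 0.40,
    }

    def conditional_pmf(joint, j):
        # Conditional probabilities Pr(X = i|Y = j), valid when Pr(Y = j) > 0.
        pr_y = sum(p for (i, y), p in joint.items() if y == j)
        if pr_y == 0.0:
            raise ValueError("Pr(Y = j) = 0: the conditional probabilities are arbitrary.")
        return {i: p / pr_y for (i, y), p in joint.items() if y == j}

    print(conditional_pmf(joint_pmf, 1))   # {0: 0.333..., 1: 0.666...}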

Example 2.9 (Independent random variables). Consider the random vector (X, Y), where X and Y may each be a vector. If X and Y are independent, then F, the distribution function of (X, Y), may be written

    F(x, y) = F_X(x) F_Y(y)

for all x, y, where F_X denotes the distribution function of the marginal distribution of X and F_Y denotes the distribution function of the marginal distribution of Y.
Hence, the conditional distribution of X given Y = y, q(·, y), must satisfy

    ∫_B ∫_A dF_X(x) dF_Y(y) = ∫_B q(A, y) dF_Y(y).

Clearly, this equation is satisfied by

    q(A, y) = ∫_A dF_X(x)

so that

    Pr(X ∈ A|Y = y) = ∫_A dF_X(x) = Pr(X ∈ A).
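A small simulation, not from the text, makes this concrete: when X and Y are drawn independently, conditioning on an event for Y should leave the distribution of X essentially unchanged. The distributions used, the set A = (−∞, 0], and the conditioning event {Y > 1} are arbitrary choices made for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200_000
    x = rng.normal(size=n)        # X has a standard normal distribution
    y = rng.exponential(size=n)   # Y has a standard exponential distribution, independent of X

    in_A = x <= 0.0               # the event {X in A} with A = (-inf, 0]
    in_B = y > 1.0                # a conditioning event {Y in B} with Pr(Y in B) > 0

    print(in_A.mean())            # estimate of Pr(X in A); approximately 0.5
    print(in_A[in_B].mean())      # estimate of Pr(X in A | Y in B); also approximately 0.5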
Two important issues are the existence and uniqueness of conditional probability distributions. Note that, for fixed A, if B satisfies Pr(Y ∈ B) = 0, then Pr(X ∈ A, Y ∈ B) = 0. The Radon-Nikodym Theorem now guarantees the existence of a function q(A, ·) satisfying (2.3). Furthermore, it may be shown that this function may be constructed in such a way that q(·, y) defines a probability distribution on X for each y. Thus, a conditional probability distribution always exists. Formal proofs of these results are quite difficult and are beyond the scope of this book; see, for example, Billingsley (1995, Chapter 6) for a detailed discussion of the technical issues involved.
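To spell out the role of the Radon-Nikodym Theorem here (a brief amplification, not part of the text): for fixed A, the set function ν_A(B) = Pr(X ∈ A, Y ∈ B) defines a finite measure on Y that, by the observation above, is absolutely continuous with respect to the marginal distribution of Y. The function q(A, ·) may therefore be taken to be the Radon-Nikodym derivative of ν_A with respect to F_Y, so that

    Pr(X ∈ A, Y ∈ B) = ∫_B q(A, y) dF_Y(y)   for all B ⊂ Y,

which is the property required of a conditional probability of the event X ∈ A.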
If, for a given set A, q_1(A, ·) and q_2(A, ·) satisfy

    ∫_B q_1(A, y) dF_Y(y) = ∫_B q_2(A, y) dF_Y(y)

for all B ⊂ Y and q_1(A, y) = Pr(X ∈ A|Y = y), then q_2(A, y) = Pr(X ∈ A|Y = y) as well. In this case, q_1(A, y) and q_2(A, y) are said to be two versions of the conditional probability. The following result shows that, while conditional probabilities are not unique, they are essentially unique.
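A simple example, not taken from the text, may help. Suppose X and Y are independent, each uniformly distributed on (0, 1), and take A = (1/4, 1/2]. Let q_1(A, y) = 1/4 for all y, and let q_2(A, y) = 1/4 for y ≠ 1/2 but q_2(A, 1/2) = 0. Since Pr(Y = 1/2) = 0,

    ∫_B q_1(A, y) dF_Y(y) = ∫_B q_2(A, y) dF_Y(y) = (1/4) Pr(Y ∈ B)

for every B ⊂ (0, 1), so q_1(A, y) and q_2(A, y) are two versions of Pr(X ∈ A|Y = y), even though they disagree at y = 1/2.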

Lemma 2.2. Let (X, Y) denote a random vector with range X × Y and let q_1(·, y) and q_2(·, y) denote two versions of the conditional probability distribution of X given Y = y. For a given set A ⊂ X, let

    Y_0 = {y ∈ Y: q_1(A, y) ≠ q_2(A, y)}.

Then Pr(Y ∈ Y_0) = 0.
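One way to see why this holds (a sketch of a standard argument, not necessarily the proof given in the text): let B_+ = {y ∈ Y: q_1(A, y) > q_2(A, y)}. Taking B = B_+ in the identity defining two versions gives

    ∫_{B_+} [q_1(A, y) − q_2(A, y)] dF_Y(y) = 0,

and since the integrand is strictly positive on B_+, this forces Pr(Y ∈ B_+) = 0. The same argument applied to {y ∈ Y: q_1(A, y) < q_2(A, y)} shows that this set also has probability 0, and Y_0 is the union of the two.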