Page 60 - Elements of Distribution Theory






                            46                  Conditional Distributions and Expectation

                                                  2.3 Conditional Distributions

                            Consider random variables X and Y. Suppose that Y is a discrete random variable taking
                            the values 0 and 1 with probabilities θ and 1 − θ, respectively, where 0 < θ < 1. From
                            elementary probability theory we know that the conditional probability that X ∈ A given
                            that Y = y is given by
\[
  \Pr(X \in A \mid Y = y) = \frac{\Pr(X \in A,\ Y = y)}{\Pr(Y = y)}, \tag{2.1}
\]
                            provided that y = 0 or y = 1, so that Pr(Y = y) > 0. Hence, for any set A, the conditional
                            probability function Pr(X ∈ A|Y = y) satisfies the equation
\begin{align*}
  \Pr(X \in A) &= \Pr(X \in A,\ Y = 0) + \Pr(X \in A,\ Y = 1) \\
               &= \Pr(X \in A \mid Y = 0)\Pr(Y = 0) + \Pr(X \in A \mid Y = 1)\Pr(Y = 1) \\
               &= \int_{-\infty}^{\infty} \Pr(X \in A \mid Y = y)\, dF_Y(y).
\end{align*}
                            Furthermore, for any subset B of {0, 1},

\begin{align*}
  \Pr(X \in A,\ Y \in B) &= \sum_{y \in B} \Pr(X \in A,\ Y = y) = \sum_{y \in B} \Pr(X \in A \mid Y = y)\Pr(Y = y) \\
                         &= \int_{B} \Pr(X \in A \mid Y = y)\, dF_Y(y). \tag{2.2}
\end{align*}
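The identities above can be checked by direct arithmetic. The following is a minimal sketch, not part of the original text: the value θ = 3/10, the range {1, 2, 3} for X, and the joint probabilities in `joint` are hypothetical choices made only for illustration. Since the conditional probability is computed from (2.1), the check confirms the bookkeeping rather than proving anything new.

```python
from fractions import Fraction

# Assumed for illustration: Pr(Y = 0) = theta = 3/10, Pr(Y = 1) = 7/10.
theta = Fraction(3, 10)
p_y = {0: theta, 1: 1 - theta}

# Hypothetical joint probabilities Pr(X = x, Y = y), with x in {1, 2, 3}.
joint = {
    (1, 0): Fraction(1, 10), (2, 0): Fraction(1, 10), (3, 0): Fraction(1, 10),
    (1, 1): Fraction(7, 20), (2, 1): Fraction(7, 40), (3, 1): Fraction(7, 40),
}

def pr_joint(A, y):
    """Pr(X in A, Y = y), summed from the joint table."""
    return sum(p for (x, yy), p in joint.items() if x in A and yy == y)

def pr_cond(A, y):
    """Pr(X in A | Y = y) as defined by (2.1); requires Pr(Y = y) > 0."""
    return pr_joint(A, y) / p_y[y]

def pr_x(A):
    """Marginal probability Pr(X in A)."""
    return sum(p for (x, _), p in joint.items() if x in A)

A = {1, 2}

# Pr(X in A) equals the sum over y of Pr(X in A | Y = y) Pr(Y = y).
total = sum(pr_cond(A, y) * p_y[y] for y in (0, 1))
assert total == pr_x(A)

# Equation (2.2) for every subset B of {0, 1}.
for B in (set(), {0}, {1}, {0, 1}):
    lhs = sum(pr_joint(A, y) for y in B)
    rhs = sum(pr_cond(A, y) * p_y[y] for y in B)
    assert lhs == rhs
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equalities hold exactly rather than up to floating-point rounding.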
                              Now suppose that Y has an absolutely continuous distribution and consider Pr(X ∈
                            A|Y = y). If the distribution of Y is absolutely continuous, then Pr(Y = y) = 0 for all y so
                            that (2.1) cannot be used as a definition of Pr(X ∈ A|Y = y). Instead, we use a definition
                            based on a generalization of (2.2).
                              Let (X, Y) denote a random vector, where X and Y may each be vectors, and let 𝒳 × 𝒴
                            denote the range of (X, Y). In general, the conditional distribution of X given Y = y is a
                            function q(A, y), defined for subsets A ⊂ 𝒳 and elements y ∈ 𝒴, such that for B ⊂ 𝒴

\[
  \Pr(X \in A,\ Y \in B) = \int_{B} q(A, y)\, dF_Y(y), \tag{2.3}
\]

                            where F_Y denotes the marginal distribution function of Y, and such that for each fixed
                            y ∈ 𝒴, q(·, y) defines a probability distribution on 𝒳. The quantity q(A, y) will be denoted
                            by Pr(X ∈ A|Y = y).
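To see how (2.3) works when Y is absolutely continuous, the following sketch compares both sides numerically. The model is an assumption introduced here for illustration and does not come from the text: Y is uniform on (0, 1) and, given Y = y, X is uniform on (0, y), so that q((0, a], y) = min(a/y, 1). The function names are likewise hypothetical.

```python
import math
import random

def q(a, y):
    """Assumed conditional probability Pr(X <= a | Y = y) for X | Y = y ~ Uniform(0, y)."""
    return min(a / y, 1.0)

def rhs_integral(a, b_lo, b_hi, steps=10_000):
    """Midpoint-rule approximation of the right side of (2.3) for B = (b_lo, b_hi).

    Y ~ Uniform(0, 1) has density 1 on (0, 1), so dF_Y(y) = dy there.
    """
    h = (b_hi - b_lo) / steps
    return sum(q(a, b_lo + (k + 0.5) * h) for k in range(steps)) * h

def lhs_monte_carlo(a, b_lo, b_hi, n=200_000, seed=0):
    """Simulation estimate of the left side, Pr(X <= a, b_lo < Y < b_hi)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        y = rng.random()            # draw Y ~ Uniform(0, 1)
        x = rng.uniform(0.0, y)     # draw X | Y = y ~ Uniform(0, y)
        if x <= a and b_lo < y < b_hi:
            hits += 1
    return hits / n

a, b_lo, b_hi = 0.25, 0.5, 1.0

# For a <= b_lo the integrand is a / y on B, so the integral is a * log(b_hi / b_lo).
exact = a * math.log(b_hi / b_lo)

print("closed form :", exact)
print("integral    :", rhs_integral(a, b_lo, b_hi))
print("simulation  :", lhs_monte_carlo(a, b_lo, b_hi))
```

The simulated joint probability and the integral of q against F_Y should agree up to Monte Carlo error, even though Pr(Y = y) = 0 for every y and (2.1) is therefore unavailable.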


                            Example 2.8 (Two-dimensional discrete random variable). Let (X, Y) denote a two-
                            dimensional discrete random variable with range

\[
  \{1, 2, \ldots, m\} \times \{1, 2, \ldots, n\}.
\]
                            For each i = 1, 2, ..., m let

\[
  q_i(y) = \Pr(X = i \mid Y = y).
\]
                            Then, according to (2.3), q_1(y), ..., q_m(y) must satisfy

\[
  \Pr(X = i,\ Y = j) = q_i(j)\,\Pr(Y = j)
\]
                            for each i = 1, ..., m and j = 1, ..., n.
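The relation in Example 2.8 can be illustrated by solving it for q_i(j) from a joint probability table. This is a minimal sketch under stated assumptions: the table `p` (with m = 2 and n = 3) is hypothetical, and every column is assumed to have Pr(Y = j) > 0 so that the division is defined.

```python
from fractions import Fraction

# Hypothetical joint pmf: p[i][j] = Pr(X = i + 1, Y = j + 1), with m = 2 and n = 3.
p = [
    [Fraction(1, 12), Fraction(1, 6), Fraction(1, 4)],
    [Fraction(1, 4), Fraction(1, 6), Fraction(1, 12)],
]
m, n = len(p), len(p[0])

# Marginal pmf of Y: p_y[j] = Pr(Y = j + 1), the column sums of the table.
p_y = [sum(p[i][j] for i in range(m)) for j in range(n)]

# Conditional pmf: q[i][j] = q_{i+1}(j + 1) = Pr(X = i + 1 | Y = j + 1).
q = [[p[i][j] / p_y[j] for j in range(n)] for i in range(m)]

# The relation Pr(X = i, Y = j) = q_i(j) Pr(Y = j) holds for every i and j.
assert all(q[i][j] * p_y[j] == p[i][j] for i in range(m) for j in range(n))

# For each fixed j, the values q_1(j), ..., q_m(j) form a probability distribution.
assert all(sum(q[i][j] for i in range(m)) == 1 for j in range(n))
```

Indices in the code are zero-based, so `q[i][j]` corresponds to q_{i+1}(j+1) in the notation of the example.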