Page 72 - Elements of Distribution Theory

                            58                  Conditional Distributions and Expectation

                            Conditional expectation as an approximation
                            Conditional expectation may be viewed as the solution to the following approximation
                            problem. Let X and Y denote random variables, which may be vectors. Let g(X) denote
                            a real-valued function of X and suppose that we wish to approximate the random variable
                            g(X) by a real-valued function of Y. Suppose further that we decide to measure the quality
                            of an approximation h(Y) by $E[(g(X) - h(Y))^2]$. Then the best approximation in this sense
                            is given by h(Y) = E[g(X)|Y]. This idea is frequently used in the context of statistical
                            forecasting in which X represents a random variable that is, as of yet, unobserved, while Y
                            represents the information currently available. A formal statement of this result is given in
                            the following corollary to Theorem 2.6.


                            Corollary 2.2. Let (X, Y) denote a random vector with range $\mathcal{X} \times \mathcal{Y}$ and let g denote a
                            real-valued function on $\mathcal{X}$ such that $E[g(X)^2] < \infty$. Let Z = E[g(X)|Y]. Then, for any
                            real-valued function h on $\mathcal{Y}$ such that $E[h(Y)^2] < \infty$,
                                                 \[ E[(h(Y) - g(X))^2] \ge E[(Z - g(X))^2] \]
                            with equality if and only if h(Y) = Z with probability 1.
                            Proof. Note that
                            \[
                            \begin{aligned}
                            E[(h(Y) - g(X))^2] &= E[(h(Y) - Z + Z - g(X))^2] \\
                              &= E[(h(Y) - Z)^2] + E[(Z - g(X))^2] + 2E\{(h(Y) - Z)(Z - g(X))\}.
                            \end{aligned}
                            \]
                            Since Z is a function of Y, h(Y) − Z is a function of Y. Furthermore,
                                               E[|h(Y) − Z|] ≤ E[|h(Y)|] + E[|Z|] < ∞
                            and
                            \[
                            \begin{aligned}
                            E[|g(X)|\,|h(Y) - Z|] &\le E[g(X)^2]^{1/2}\, E[|h(Y) - Z|^2]^{1/2} \\
                              &\le E[|g(X)|^2]^{1/2}\, \{2E[|h(Y)|^2] + 2E[|Z|^2]\}^{1/2} < \infty,
                            \end{aligned}
                            \]
                            using the fact that
                                       \[ E[|Z|^2] = E\{E[g(X)|Y]^2\} \le E\{E[g(X)^2|Y]\} = E[g(X)^2] < \infty. \]
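The bounds above appear to rest on three standard inequalities. The display below is a sketch of where each one enters, not part of the original proof. The symbols U, V, a and b are generic placeholders introduced here for illustration: take U = g(X) and V = h(Y) − Z in the first line, and a = h(Y), b = Z in the second.

```latex
\begin{aligned}
E[|UV|] &\le E[U^2]^{1/2}\, E[V^2]^{1/2}
  && \text{(Cauchy--Schwarz inequality)} \\
(a - b)^2 &\le 2a^2 + 2b^2
  && \text{(since } 2a^2 + 2b^2 - (a - b)^2 = (a + b)^2 \ge 0\text{)} \\
E[g(X) \mid Y]^2 &\le E[g(X)^2 \mid Y]
  && \text{(conditional Jensen inequality)}
\end{aligned}
```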
                            Hence, by Theorem 2.6,
                                                E{(h(Y) − Z)Z}= E{(h(Y) − Z)g(X)}
                            so that the cross term $2E\{(h(Y) - Z)(Z - g(X))\}$ vanishes and
                                         \[ E[(h(Y) - g(X))^2] = E[(h(Y) - Z)^2] + E[(Z - g(X))^2]. \]
                            It follows that
                                                 \[ E[(g(X) - h(Y))^2] \ge E[(g(X) - Z)^2] \]
                            with equality if and only if $E[(h(Y) - Z)^2] = 0$, which occurs if and only if h(Y) = Z with
                            probability 1.
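As a rough numerical illustration of Corollary 2.2, the following Python sketch checks the decomposition used in the proof by simulation. The model is an assumption made here for illustration and does not come from the text: Y is standard normal, X = Y + eps with eps standard normal and independent of Y, and g(x) = x^2, so that Z = E[g(X)|Y] = Y^2 + 1 in closed form. The competing function h(Y) = Y^2 and the function name `simulate` are likewise hypothetical choices.

```python
import random


def simulate(n=100_000, seed=0):
    """Monte Carlo estimates for a toy model (an assumption, not from the text).

    Model: Y ~ N(0, 1), X = Y + eps with eps ~ N(0, 1) independent of Y,
    and g(x) = x**2, so that Z = E[g(X) | Y] = Y**2 + 1.

    Returns estimates of E[(Z - g(X))^2], E[(h(Y) - g(X))^2] and
    E[(h(Y) - Z)^2] for the competing function h(y) = y**2.
    """
    rng = random.Random(seed)
    mse_z = mse_h = gap = 0.0
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        x = y + rng.gauss(0.0, 1.0)
        g = x * x            # the quantity being approximated, g(X)
        z = y * y + 1.0      # conditional expectation E[g(X) | Y]
        h = y * y            # some other function of Y
        mse_z += (z - g) ** 2
        mse_h += (h - g) ** 2
        gap += (h - z) ** 2
    return mse_z / n, mse_h / n, gap / n


mse_z, mse_h, gap = simulate()
print("E[(Z - g(X))^2]    ~", mse_z)
print("E[(h(Y) - g(X))^2] ~", mse_h)
print("E[(h(Y) - Z)^2]    ~", gap)
```

Under this toy model the three expectations equal 6, 7 and 1 exactly, so the estimates should satisfy `mse_h ≈ mse_z + gap` up to Monte Carlo error, with `mse_z` the smaller of the two mean squared errors.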
                            Example 2.17 (Independent random variables). Let Y denote a real-valued random variable
                            with $E(Y^2) < \infty$ and let X denote a random vector such that X and Y are independent.