Page 72 - Elements of Distribution Theory
052184472Xc02 CUNY148/Severini May 24, 2005 2:29
58 Conditional Distributions and Expectation
Conditional expectation as an approximation
Conditional expectation may be viewed as the solution to the following approximation
problem. Let X and Y denote random variables, which may be vectors. Let g(X) denote
a real-valued function of X and suppose that we wish to approximate the random variable
g(X) by a real-valued function of Y. Suppose further that we decide to measure the quality
of an approximation h(Y) by $E[(g(X) - h(Y))^2]$. Then the best approximation in this sense
is given by h(Y) = E[g(X)|Y]. This idea is frequently used in the context of statistical
forecasting in which X represents a random variable that is, as of yet, unobserved, while Y
represents the information currently available. A formal statement of this result is given in
the following corollary to Theorem 2.6.
Corollary 2.2. Let (X, Y) denote a random vector with range $\mathcal{X} \times \mathcal{Y}$ and let g denote a
real-valued function on $\mathcal{X}$ such that $E[g(X)^2] < \infty$. Let Z = E[g(X)|Y]. Then, for any
real-valued function h on $\mathcal{Y}$ such that $E[h(Y)^2] < \infty$,
$$E[(h(Y) - g(X))^2] \ge E[(Z - g(X))^2]$$
with equality if and only if h(Y) = Z with probability 1.
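The inequality in Corollary 2.2 can be checked by hand on a small discrete example. The sketch below is only an illustration: the joint probability function `pmf`, the choice g(x) = x^2, and the grid of competing functions h are all assumptions made for this example and do not come from the text. Exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y) on a small finite range; the values are
# illustrative only and are not taken from the text.
pmf = {
    (0, 0): F(1, 8), (1, 0): F(1, 8), (2, 0): F(1, 4),
    (0, 1): F(1, 4), (1, 1): F(1, 8), (2, 1): F(1, 8),
}


def g(x):
    """The function of X to be approximated (an assumed choice)."""
    return x * x


def expect(f):
    """E[f(X, Y)] under the joint pmf."""
    return sum(p * f(x, y) for (x, y), p in pmf.items())


def cond_mean(y0):
    """E[g(X) | Y = y0], computed directly from the joint pmf."""
    p_y = sum(p for (x, y), p in pmf.items() if y == y0)
    return sum(p * g(x) for (x, y), p in pmf.items() if y == y0) / p_y


def mse(h):
    """E[(h(Y) - g(X))^2] for a function h stored as a dict y -> value."""
    return expect(lambda x, y: (h[y] - g(x)) ** 2)


# Z = E[g(X)|Y], stored as a table indexed by the value of Y.
z = {y0: cond_mean(y0) for y0 in (0, 1)}
best = mse(z)

# Compare Z against every h whose two values lie on a coarse grid.
grid = [F(k, 4) for k in range(17)]
gaps = [mse({0: a, 1: b}) - best for a in grid for b in grid]
smallest_gap = min(gaps)
```

In this example every competitor on the grid has mean squared error at least as large as that of Z, and the gap is zero only when h agrees with Z at both values of Y.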
Proof. Note that
$$E[(h(Y) - g(X))^2] = E[(h(Y) - Z + Z - g(X))^2]$$
$$= E[(h(Y) - Z)^2] + E[(Z - g(X))^2] + 2E\{(h(Y) - Z)(Z - g(X))\}.$$
Since Z is a function of Y, h(Y) − Z is a function of Y. Furthermore,
E[|h(Y) − Z|] ≤ E[|h(Y)|] + E[|Z|] < ∞
and
$$E[|g(X)||h(Y) - Z|] \le E[g(X)^2]^{1/2} E[|h(Y) - Z|^2]^{1/2}$$
$$\le E[|g(X)|^2]^{1/2} \{2E[|h(Y)|^2] + 2E[|Z|^2]\}^{1/2} < \infty,$$
using the fact that
$$E[|Z|^2] = E\{E[g(X)|Y]^2\} \le E\{E[g(X)^2|Y]\} = E[g(X)^2] < \infty.$$
Hence, by Theorem 2.6,
E{(h(Y) − Z)Z} = E{(h(Y) − Z)g(X)}
so that
$$E[(h(Y) - g(X))^2] = E[(h(Y) - Z)^2] + E[(Z - g(X))^2].$$
It follows that
$$E[(g(X) - h(Y))^2] \ge E[(g(X) - Z)^2]$$
with equality if and only if $E[(h(Y) - Z)^2] = 0$, which occurs if and only if h(Y) = Z with
probability 1.
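The key step of the proof, that the cross term E{(h(Y) − Z)(Z − g(X))} vanishes and the squared error therefore splits into two parts, can also be verified numerically. The following is a minimal sketch under assumed inputs: the joint probability function `pmf`, the function g(x) = x^2, and the competitor h(y) = 1 + y are hypothetical choices made for illustration only.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y); illustrative values, not from the text.
pmf = {
    (0, 0): F(1, 8), (1, 0): F(1, 8), (2, 0): F(1, 4),
    (0, 1): F(1, 4), (1, 1): F(1, 8), (2, 1): F(1, 8),
}


def g(x):
    """Assumed function of X."""
    return x * x


def h(y):
    """An arbitrary competing function of Y (assumed for illustration)."""
    return F(1) + y


def expect(f):
    """E[f(X, Y)] under the joint pmf."""
    return sum(p * f(x, y) for (x, y), p in pmf.items())


def z(y0):
    """Z evaluated at Y = y0, that is, E[g(X) | Y = y0]."""
    p_y = sum(p for (x, y), p in pmf.items() if y == y0)
    return sum(p * g(x) for (x, y), p in pmf.items() if y == y0) / p_y


# The three expectations appearing in the expansion of E[(h(Y) - g(X))^2].
total = expect(lambda x, y: (h(y) - g(x)) ** 2)
part_h = expect(lambda x, y: (h(y) - z(y)) ** 2)
part_z = expect(lambda x, y: (z(y) - g(x)) ** 2)
cross = expect(lambda x, y: (h(y) - z(y)) * (z(y) - g(x)))
```

Here `cross` is exactly zero, so `total` equals `part_h + part_z`. Since `part_h` is nonnegative, `total` can never fall below `part_z`, which is the content of the corollary.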
Example 2.17 (Independent random variables). Let Y denote a real-valued random variable
with $E(Y^2) < \infty$ and let X denote a random vector such that X and Y are independent.