Page 509 - Probability and Statistical Inference

486    10. Bayesian Methods

risks R*(θ, δᵢ), i = 1, 2, with respect to the prior h(θ) and then check to see
which weighted average is smaller. The estimator with the smaller average
risk should be the preferred estimator.
   So, let us define the Bayesian risk (as opposed to the frequentist risk) of an
estimator δ under the prior h(θ):

   r*(v, δ) = ∫_Θ R*(θ, δ) h(θ) dθ.

Suppose that D is the class of all estimators of θ whose Bayesian risks are
finite. Now, the best estimator under the Bayesian paradigm will be δ* from D
such that

   r*(v, δ*) ≤ r*(v, δ) for all δ ∈ D.

Such an estimator will be called the Bayes estimator of θ. In many standard
problems, the Bayes estimator δ* happens to be unique.
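The comparison of weighted average risks described above can be carried out numerically. The following is a minimal sketch, assuming a hypothetical Beta-Binomial setup not taken from the text: θ ~ Beta(a, b), T ~ Binomial(n, θ), squared-error loss, with two candidate estimators d1 (the sample proportion) and d2 (the posterior mean). It evaluates each frequentist risk R*(θ, δ) exactly as a finite sum, then integrates against the prior h(θ) by a midpoint rule to get the Bayesian risks.

```python
import numpy as np
from math import comb, gamma

# Hypothetical setup (not from the text): theta ~ Beta(a, b), T ~ Binomial(n, theta)
n, a, b = 10, 2, 2
d1 = lambda t: t / n                      # sample proportion (MLE)
d2 = lambda t: (t + a) / (n + a + b)      # posterior mean under Beta(a, b)

def freq_risk(delta, theta):
    # R*(theta, delta) = sum_t (delta(t) - theta)^2 g(t; theta), squared-error loss
    return sum(comb(n, t) * theta**t * (1 - theta)**(n - t) * (delta(t) - theta)**2
               for t in range(n + 1))

# midpoint rule for r*(v, delta) = integral over Theta of R*(theta, delta) h(theta)
edges = np.linspace(0.0, 1.0, 2001)
theta = (edges[:-1] + edges[1:]) / 2
dth = edges[1] - edges[0]
h = gamma(a + b) / (gamma(a) * gamma(b)) * theta**(a - 1) * (1 - theta)**(b - 1)

r1 = np.sum(np.array([freq_risk(d1, th) for th in theta]) * h) * dth
r2 = np.sum(np.array([freq_risk(d2, th) for th in theta]) * h) * dth
print(r1, r2)  # d2 has the smaller Bayesian risk, so it is preferred here
```

For d1 the frequentist risk is θ(1 − θ)/n, so its Bayesian risk under Beta(2, 2) works out to 0.02; d2, being the posterior mean, necessarily does at least as well under squared-error loss.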
   Let us suppose that we are going to consider only those estimators δ
and priors h(θ) for which both R*(θ, δ) and r*(v, δ) are finite for all θ ∈ Θ.

   Theorem 10.4.1 The Bayes estimator δ* = δ*(T) is to be determined in
such a way that the posterior risk of δ*(t) is the least possible, that is,

   ∫_Θ L(θ, δ*(t)) k(t; θ) dθ = inf_δ ∫_Θ L(θ, δ(t)) k(t; θ) dθ,    (10.4.5)

for all possible observed data t ∈ T, where L(θ, ·) denotes the loss function
and k(t; θ) is the posterior density of θ given T = t.
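Theorem 10.4.1 says the Bayes estimate can be found pointwise: fix the observed t and minimize the posterior risk over candidate estimates. A minimal numerical sketch, again assuming a hypothetical Beta-Binomial model with squared-error loss (not from the text), scans a grid of actions and locates the one with the smallest posterior risk; under squared-error loss this minimizer should coincide with the posterior mean.

```python
import numpy as np
from math import gamma

# Hypothetical conjugate setup: prior Beta(a, b), observed T = t from Binomial(n, theta)
n, a, b, t = 10, 2, 2, 7
pa, pb = t + a, n - t + b                 # posterior is Beta(pa, pb)

edges = np.linspace(0.0, 1.0, 4001)
theta = (edges[:-1] + edges[1:]) / 2
dth = edges[1] - edges[0]
post = gamma(pa + pb) / (gamma(pa) * gamma(pb)) * theta**(pa - 1) * (1 - theta)**(pb - 1)

# posterior risk of action d under squared-error loss: integral of (theta - d)^2 k
actions = np.linspace(0.0, 1.0, 1001)
post_risk = [np.sum((theta - d)**2 * post) * dth for d in actions]
best = actions[int(np.argmin(post_risk))]

post_mean = np.sum(theta * post) * dth
print(best, post_mean)  # the grid minimizer agrees with the posterior mean
```

Here the posterior mean is pa/(pa + pb) = 9/14 ≈ 0.643, and the grid search recovers it to within the grid spacing.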
   Proof Assuming that m(t) > 0, let us express the Bayesian risk in the
following form:

   r*(v, δ) = ∫_Θ R*(θ, δ) h(θ) dθ
            = ∫_Θ {∫_T L(θ, δ(t)) g(t; θ) dt} h(θ) dθ
            = ∫_T {∫_Θ L(θ, δ(t)) k(t; θ) dθ} m(t) dt.

In the last step, we used the relation g(t; θ)h(θ) = k(t; θ)m(t) and the fact that
the order of the double integral ∫_Θ ∫_T can be changed to ∫_T ∫_Θ because the
integrands are non-negative. The interchanging of the order of the integrals is
allowed here in view of a result known as Fubini's Theorem, which is stated
as Exercise 10.4.10 for reference.
   Now, suppose that we have observed the data T = t. Then, the Bayes
estimate δ*(t) must be the one associated with the smallest posterior risk
∫_Θ L(θ, δ(t)) k(t; θ) dθ. The proof is complete. ■
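The key step of the proof, rewriting the θ-first double integral as a t-first one via g(t; θ)h(θ) = k(t; θ)m(t), can be checked numerically. This sketch, using the same hypothetical Beta-Binomial model with squared-error loss as above, computes the Bayesian risk of one estimator both ways: integrating the frequentist risk against the prior, and averaging the posterior risks against the marginal m(t).

```python
import numpy as np
from math import comb, gamma

# Hypothetical setup: theta ~ Beta(a, b), T ~ Binomial(n, theta), squared-error loss
n, a, b = 10, 2, 2
delta = lambda t: (t + a) / (n + a + b)   # one candidate estimator (posterior mean)

edges = np.linspace(0.0, 1.0, 4001)
theta = (edges[:-1] + edges[1:]) / 2
dth = edges[1] - edges[0]
h = gamma(a + b) / (gamma(a) * gamma(b)) * theta**(a - 1) * (1 - theta)**(b - 1)

# g(t; theta): one row of binomial probabilities per value of t
g = np.array([comb(n, t) * theta**t * (1 - theta)**(n - t) for t in range(n + 1)])
L = np.array([(delta(t) - theta)**2 for t in range(n + 1)])

# theta-first: r*(v, delta) = integral of R*(theta, delta) h(theta) dtheta
risk_theta_first = np.sum((L * g).sum(axis=0) * h) * dth

# t-first: sum over t of m(t) times the posterior risk, with k = g h / m
m = (g * h).sum(axis=1) * dth             # marginal pmf of T
k = g * h / m[:, None]                    # posterior density k(t; theta) for each t
risk_t_first = np.sum(m * (L * k).sum(axis=1) * dth)

print(risk_theta_first, risk_t_first)     # the two orderings agree, as Fubini guarantees
```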
   An attractive feature of the Bayes estimator δ* is this: Having observed
T = t, we can explicitly determine δ*(t) by implementing the process of
minimizing the posterior risk as stated in (10.4.5). In the case of the squared