Page 508 - Probability and Statistical Inference


10. Bayesian Methods 485

posterior pdf k(θ; t) which is directly affected by the choice of h(θ). This is
why it is of paramount importance that the prior pmf or pdf h(θ) be fixed in
advance of data collection, so that the evidence regarding ϑ obtained from
both the likelihood function and the prior distribution remains useful and
credible.

10.4 Point Estimation

In this section, we explore briefly how one approaches the point estimation
problems of an unknown parameter under a particular loss function. Recall
that the data consists of a random sample $X = (X_1, \ldots, X_n)$ given that ϑ = θ.
Suppose that a real valued statistic T is (minimal) sufficient for θ given that
ϑ = θ. Let 𝒯 denote the domain of t. As before, instead of considering the
likelihood function itself, we will only consider the pmf or pdf g(t; θ) of the
sufficient statistic T at the point T = t given that ϑ = θ, for all t ∈ 𝒯. Let h(θ)
be the prior distribution of ϑ, θ ∈ Θ.
An arbitrary point estimator of ϑ may be denoted by δ ≡ δ(T) which takes
the value δ(t) when one observes T = t, t ∈ 𝒯. Suppose that the loss in estimating
ϑ by the estimator δ(T) is given by

$$L^*(\theta, \delta) \equiv L^*(\theta, \delta(t)) = [\delta(t) - \theta]^2, \tag{10.4.1}$$

which is referred to as the squared error loss.
The mean squared error (MSE) discussed in Section 7.3.1 corresponds
to the weighted average of the loss function from (10.4.1) with respect to the
weights assigned by the pmf or pdf g(t; θ). In other words, this average is
actually a conditional average given that ϑ = θ. Let us define, conditionally
given that ϑ = θ, the risk function associated with the estimator δ:

$$R^*(\theta, \delta) \equiv E_{T \mid \vartheta = \theta}\{[\delta(T) - \theta]^2\} = \int_{\mathcal{T}} [\delta(t) - \theta]^2 \, g(t; \theta) \, dt,$$

with the integral replaced by a sum when T is discrete. This is the
frequentist risk which was referred to as MSE in Section 7.3.
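The frequentist risk can be evaluated numerically once g(t; θ) is specified. The sketch below is only an illustration under assumptions not made in the text: the data are taken to be Bernoulli(θ), so that T = X_1 + ... + X_n is binomial, and the names `binomial_pmf`, `frequentist_risk`, and `sample_mean`, as well as the sample size n = 10, are hypothetical choices. For the estimator δ(t) = t/n, the computed risk should agree with the familiar variance θ(1 − θ)/n.

```python
from math import comb


def binomial_pmf(t, n, theta):
    """g(t; theta): pmf of T = X_1 + ... + X_n for Bernoulli(theta) data (assumed model)."""
    return comb(n, t) * theta**t * (1 - theta) ** (n - t)


def frequentist_risk(delta, n, theta):
    """R*(theta, delta): squared error loss averaged over g(t; theta).

    T is discrete here, so the average is a sum over t = 0, ..., n.
    """
    return sum(
        (delta(t) - theta) ** 2 * binomial_pmf(t, n, theta)
        for t in range(n + 1)
    )


n = 10  # hypothetical sample size


def sample_mean(t):
    """delta(t) = t / n, the sample proportion."""
    return t / n


for theta in (0.1, 0.3, 0.5):
    risk = frequentist_risk(sample_mean, n, theta)
    closed_form = theta * (1 - theta) / n
    print(f"theta={theta:.1f}  risk={risk:.6f}  theta(1-theta)/n={closed_form:.6f}")
```

The risk is a function of θ alone once δ is fixed, since the observed value t has been averaged out.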
In Chapter 7, we saw examples of estimators $\delta_1$ and $\delta_2$ with risk functions
$R^*(\theta, \delta_i)$, i = 1, 2, where $R^*(\theta, \delta_1) > R^*(\theta, \delta_2)$ for some parameter values θ
whereas $R^*(\theta, \delta_1) \le R^*(\theta, \delta_2)$ for other parameter values θ. In other words,
by comparing the two frequentist risk functions of $\delta_1$ and $\delta_2$, in some situations
one may not be able to judge which estimator is decisively superior.
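A small numerical sketch may make the crossing of two risk functions concrete. It is not one of the Chapter 7 examples: it assumes Bernoulli(θ) data with T binomial, and compares the sample proportion with a hypothetical shrinkage estimator (t + 1)/(n + 2). The names `risk`, `delta_1`, `delta_2` and the value n = 10 are illustrative assumptions.

```python
from math import comb


def risk(delta, n, theta):
    """Squared error risk of delta(T) when T ~ Binomial(n, theta) (assumed model)."""
    return sum(
        (delta(t) - theta) ** 2 * comb(n, t) * theta**t * (1 - theta) ** (n - t)
        for t in range(n + 1)
    )


n = 10  # hypothetical sample size


def delta_1(t):
    """Sample proportion t / n."""
    return t / n


def delta_2(t):
    """Hypothetical shrinkage estimator that pulls the estimate toward 1/2."""
    return (t + 1) / (n + 2)


for theta in (0.05, 0.5):
    r1 = risk(delta_1, n, theta)
    r2 = risk(delta_2, n, theta)
    smaller = "delta_1" if r1 < r2 else "delta_2"
    print(f"theta={theta:.2f}  R(delta_1)={r1:.5f}  R(delta_2)={r2:.5f}  smaller: {smaller}")
```

Under these assumptions, δ_1 has the smaller risk near θ = 0.05 while δ_2 has the smaller risk at θ = 0.5, so neither estimator dominates the other over the whole parameter space.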
The prior h(θ) sets a sense of preference and priority of some values of
ϑ over other values of ϑ. In a situation like this, while comparing two
estimators $\delta_1$ and $\delta_2$, one may consider averaging the associated frequentist
