Page 508 - Probability and Statistical Inference
10. Bayesian Methods 485
posterior pdf k(θ; t) which is directly affected by the choice of h(θ). This is
why it is of paramount importance that the prior pmf or pdf h(θ) is fixed in
advance of data collection, so that both pieces of evidence regarding ϑ, obtained
from the likelihood function and the prior distribution, remain useful and credible.
10.4 Point Estimation
In this section, we explore briefly how one approaches the point estimation
problems of an unknown parameter under a particular loss function. Recall
that the data consists of a random sample X = (X₁, ..., Xₙ) given that ϑ = θ.
Suppose that a real valued statistic T is (minimal) sufficient for θ given that
ϑ = θ. Let 𝒯 denote the domain of t. As before, instead of considering the
likelihood function itself, we will only consider the pmf or pdf g(t; θ) of the
sufficient statistic T at the point T = t given that ϑ = θ, for all t ∈ 𝒯. Let h(θ)
be the prior distribution of ϑ, θ ∈ Θ.
An arbitrary point estimator of ϑ may be denoted by δ ≡ δ(T) which takes
the value δ(t) when one observes T = t, t ∈ 𝒯. Suppose that the loss in estimating
ϑ by the estimator δ(T) is given by

$$L^*(\vartheta, \delta) \equiv L^*(\vartheta, \delta(T)) = [\delta(T) - \vartheta]^2, \tag{10.4.1}$$

which is referred to as the squared error loss.
The mean squared error (MSE) discussed in Section 7.3.1 will correspond
to the weighted average of the loss function from (10.4.1) with respect to the
weights assigned by the pmf or pdf g(t; θ). In other words, this average is
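actually a conditional average given that ϑ = θ. Let us define, conditionally
given that ϑ = θ, the risk function associated with the estimator δ:

$$R^*(\theta, \delta) \equiv E_{T \mid \vartheta = \theta}\{[\delta(T) - \theta]^2\} = \int_{\mathcal{T}} [\delta(t) - \theta]^2 \, g(t; \theta)\, dt,$$

with the integral replaced by a sum when T is discrete.
This is the frequentist risk, which was referred to as the MSE in Section 7.3.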
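To make the weighted-average description concrete, here is a minimal sketch that evaluates the frequentist risk under squared error loss by summing the loss against the weights g(t; θ). The setting is an assumption chosen only for illustration and is not taken from the text: T is taken to be Binomial(n, θ), the estimator is the sample proportion, and the names `binomial_pmf`, `frequentist_risk` and `sample_proportion` are hypothetical.

```python
from math import comb


def binomial_pmf(t, n, theta):
    """g(t; theta) for the assumed model T ~ Binomial(n, theta)."""
    return comb(n, t) * theta**t * (1 - theta) ** (n - t)


def squared_error_loss(theta, estimate):
    """Squared error loss [delta(t) - theta]^2."""
    return (estimate - theta) ** 2


def frequentist_risk(delta, n, theta):
    """Average the loss over t = 0, ..., n with weights g(t; theta)."""
    return sum(
        squared_error_loss(theta, delta(t, n)) * binomial_pmf(t, n, theta)
        for t in range(n + 1)
    )


def sample_proportion(t, n):
    """Illustrative estimator delta(t) = t / n."""
    return t / n


n, theta = 10, 0.3
risk = frequentist_risk(sample_proportion, n, theta)
closed_form = theta * (1 - theta) / n  # variance of T / n, which is unbiased
print(risk, closed_form)
```

Because the sample proportion is unbiased in this assumed model, its risk should agree with its variance θ(1 − θ)/n up to floating-point error, which gives a simple check on the summation.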
In Chapter 7, we saw examples of estimators δ₁ and δ₂ with risk functions
R*(θ, δᵢ), i = 1, 2 where R*(θ, δ₁) > R*(θ, δ₂) for some parameter values θ
whereas R*(θ, δ₁) ≤ R*(θ, δ₂) for other parameter values θ. In other words,
by comparing the two frequentist risk functions of δ₁ and δ₂, in some situations
one may not be able to judge which estimator is decisively superior.
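The crossing of two risk functions can be seen in a small numerical sketch. Everything specific here is an assumption made for illustration rather than an example from the text: T is taken to be Binomial(n, θ), `delta1` is the sample proportion t/n, and `delta2` is the shrinkage estimator (t + 1)/(n + 2).

```python
from math import comb


def risk(delta, n, theta):
    """Frequentist risk under squared error loss, assuming T ~ Binomial(n, theta)."""
    return sum(
        (delta(t, n) - theta) ** 2
        * comb(n, t) * theta**t * (1 - theta) ** (n - t)
        for t in range(n + 1)
    )


def delta1(t, n):
    """Sample proportion (hypothetical first estimator)."""
    return t / n


def delta2(t, n):
    """Estimator shrunk toward 1/2 (hypothetical second estimator)."""
    return (t + 1) / (n + 2)


n = 10
for theta in (0.05, 0.25, 0.50, 0.75, 0.95):
    r1, r2 = risk(delta1, n, theta), risk(delta2, n, theta)
    smaller = "delta1" if r1 < r2 else "delta2"
    print(f"theta={theta:.2f}  R1={r1:.5f}  R2={r2:.5f}  smaller risk: {smaller}")
```

In this assumed setting, `delta2` has the smaller risk for θ near 1/2 while `delta1` has the smaller risk for θ near 0 or 1, so neither estimator dominates the other over the whole parameter space.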
The prior h(θ) sets a sense of preference and priority of some values of
ϑ over other values of ϑ. In a situation like this, while comparing two
estimators δ₁ and δ₂, one may consider averaging the associated frequentist

