Page 505 - Probability and Statistical Inference


482 10. Bayesian Methods

beta distributions, then k(θ; t) must correspond to the pdf of an appropriate
beta distribution. Similarly, for conjugacy, if h(θ) is chosen from the family of
normal or gamma distributions, then k(θ; t) must correspond to the pdf of an
appropriate normal or gamma distribution respectively. The property of
conjugacy demands that h(θ) and k(θ; t) must belong to the same family of
distributions.
Example 10.3.1 (Example 10.2.1 Continued) Suppose that we have the
random variables $X_1, ..., X_n$ which are iid Bernoulli(θ) given that ϑ = θ,
where ϑ is the unknown probability of success, 0 < ϑ < 1. Given that ϑ = θ, the
statistic $T = \sum_{i=1}^{n} X_i$ is minimal sufficient for θ, and recall that one has
$$g(t; \theta) = \binom{n}{t} \theta^{t} (1 - \theta)^{n-t}$$
for t ∈ T = {0, 1, ..., n}.
In the expression for g(t; θ), carefully look at the part which depends on θ,
namely $\theta^{t} (1 - \theta)^{n-t}$. It resembles a beta pdf without the normalizing constant.
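Hence, we suppose that the prior distribution of ϑ on the space Θ = (0, 1) is
Beta(α, β) where α (> 0) and β (> 0) are known numbers. From (10.2.2), for
t ∈ T, we then obtain the marginal pmf of T as follows:
$$m(t) = \int_0^1 \binom{n}{t} \theta^{t} (1 - \theta)^{n-t} \, \frac{\theta^{\alpha-1} (1 - \theta)^{\beta-1}}{B(\alpha, \beta)} \, d\theta = \binom{n}{t} \frac{B(t + \alpha, n - t + \beta)}{B(\alpha, \beta)}, \tag{10.3.1}$$
where B(·, ·) denotes the beta function.
Now, using (10.2.3) and (10.3.1), the posterior pdf of ϑ given the data T = t
simplifies to
$$k(\theta; t) = \frac{\theta^{t+\alpha-1} (1 - \theta)^{n-t+\beta-1}}{B(t + \alpha, n - t + \beta)} \tag{10.3.2}$$
for all 0 < θ < 1 and fixed values t ∈ T. In other words, the posterior pdf of the
success probability ϑ is the same as that for the Beta(t + α, n − t + β) distribution.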
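The conjugate update can be checked numerically. The following is a minimal sketch, not part of the original text: it normalizes the product of the binomial pmf and a Beta(α, β) prior pdf on a grid over (0, 1) and compares the result with the Beta(t + α, n − t + β) pdf. The helper names (`beta_pdf`, `binomial_pmf`, `posterior_on_grid`) and the values of n, t, α, β are hypothetical choices made only for illustration.

```python
import math


def beta_pdf(theta, a, b):
    """Beta(a, b) density at theta, for 0 < theta < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(
        log_norm + (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)
    )


def binomial_pmf(t, n, theta):
    """g(t; theta): pmf of T = X_1 + ... + X_n for iid Bernoulli(theta) data."""
    return math.comb(n, t) * theta**t * (1 - theta) ** (n - t)


def posterior_on_grid(t, n, alpha, beta, m=2000):
    """Normalize g(t; theta) * h(theta) numerically on a midpoint grid."""
    width = 1.0 / m
    grid = [(i + 0.5) * width for i in range(m)]
    unnorm = [binomial_pmf(t, n, th) * beta_pdf(th, alpha, beta) for th in grid]
    marginal = sum(unnorm) * width  # approximates the marginal pmf of T at t
    return grid, [u / marginal for u in unnorm]


# Hypothetical data and prior parameters, chosen only for illustration.
n, t, alpha, beta = 10, 7, 2.0, 3.0
grid, numeric = posterior_on_grid(t, n, alpha, beta)

# Closed-form conjugate posterior: Beta(t + alpha, n - t + beta).
closed_form = [beta_pdf(th, t + alpha, n - t + beta) for th in grid]

max_gap = max(abs(a - b) for a, b in zip(numeric, closed_form))
print(f"largest gap between numeric and Beta posterior: {max_gap:.2e}")
```

The two curves should agree up to the discretization error of the midpoint rule, which is what conjugacy predicts: only the two beta parameters need updating, and no numerical integration is required in practice.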

We started with the beta prior and ended up with a beta posterior.
Here, the beta pdf for ϑ is the conjugate prior for ϑ.
In Example 10.2.1, the uniform prior was actually the Beta(1, 1) distribution.
The posterior found in (10.3.2) agrees fully with that given by (10.2.6)
when α = β = 1. ▲
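As a rough numerical companion to this example, the sketch below evaluates the marginal pmf of T under a Beta(α, β) prior, assuming the standard beta-binomial form C(n, t) B(t + α, n − t + β) / B(α, β), where B is the beta function. With the uniform prior α = β = 1, every value t = 0, 1, ..., n should receive probability 1/(n + 1). The function names and the numbers used here are hypothetical and serve only as an illustration.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def marginal_pmf(t, n, alpha, beta):
    """Marginal pmf of T under a Beta(alpha, beta) prior (beta-binomial form)."""
    log_ratio = log_beta(t + alpha, n - t + beta) - log_beta(alpha, beta)
    return math.comb(n, t) * math.exp(log_ratio)


n = 8  # hypothetical sample size

# Uniform prior, Beta(1, 1): the marginal pmf should be flat on {0, ..., n}.
uniform_case = [marginal_pmf(t, n, 1.0, 1.0) for t in range(n + 1)]

# An arbitrary non-uniform prior: the marginal pmf should still sum to 1.
general_case = [marginal_pmf(t, n, 2.5, 4.0) for t in range(n + 1)]

print([round(p, 4) for p in uniform_case])
print(round(sum(general_case), 6))
```

Working on the log scale with `math.lgamma` avoids overflow in the gamma functions when n or the prior parameters are large.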
Example 10.3.2 Let $X_1, ..., X_n$ be iid Poisson(θ) given that ϑ = θ, where
ϑ (> 0) is the unknown population mean. Given that ϑ = θ, the statistic
$T = \sum_{i=1}^{n} X_i$ is minimal sufficient for θ, and recall that one has
$g(t; \theta) = e^{-n\theta} (n\theta)^{t} / t!$ for t ∈ T = {0, 1, 2, ...}.
In the expression of g(t; θ), look at the part which depends on θ, namely
$e^{-n\theta} \theta^{t}$. It resembles a gamma pdf without the normalizing constant. Hence,
we suppose that the prior distribution of ϑ on the space Θ = (0, ∞) is
