Page 307 - Probability and Statistical Inference

6. Sufficiency, Completeness, and Ancillarity
µ)/σ, is not a statistic because it involves the unknown parameter µ, and
hence its value associated with any observed data x_1, ..., x_n cannot be
calculated.
Definition 6.2.2 A real valued statistic T is called sufficient (for the un-
known parameter θ) if and only if the conditional distribution of the random
sample X = (X_1, ..., X_n) given T = t does not involve θ, for all t ∈ 𝒯, the
domain space for T.
In other words, given the value t of a sufficient statistic T, conditionally
there is no more information left in the original data regarding the unknown
parameter θ. Put another way, we may think of X trying to tell us a story
about θ, but once a sufficient summary T becomes available, the original
story then becomes redundant. Observe that the whole data X is always suf-
ficient for θ in this sense. But we are aiming at a shorter summary statistic
that carries the same amount of information as X. Thus, once we find
a sufficient statistic T, we will focus only on the summary statistic T. Before
we give other details, we define the concept of joint sufficiency of a vector
valued statistic T for an unknown parameter θ.
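The idea in Definition 6.2.2 can be checked numerically. The sketch below, a minimal illustration assuming iid Bernoulli(p) data with T = X_1 + ... + X_n (the statistic treated in Example 6.2.1 below), computes the conditional probability of each sample outcome given T and confirms that the answer comes out the same for every p, that is, free of the unknown parameter:

```python
# A numeric check of Definition 6.2.2 for iid Bernoulli(p) data with
# T = X_1 + ... + X_n. Whatever the value of p, the conditional
# probability of a sample point given T = t should not change.
from itertools import product
from math import comb

def conditional_prob(x, p):
    """P_p{X_1 = x_1, ..., X_n = x_n | T = sum(x)} for iid Bernoulli(p)."""
    n, t = len(x), sum(x)
    joint = p**t * (1 - p)**(n - t)                   # P_p{X = x}
    marginal = comb(n, t) * p**t * (1 - p)**(n - t)   # P_p{T = t}, Binomial(n, p)
    return joint / marginal                           # reduces to 1 / C(n, t)

n = 4
for x in product([0, 1], repeat=n):
    probs = [conditional_prob(x, p) for p in (0.2, 0.5, 0.9)]
    # the conditional probability is identical for every p ...
    assert max(probs) - min(probs) < 1e-12
    # ... and equals 1 / C(n, t), a quantity free of p
    assert abs(probs[0] - 1 / comb(n, sum(x))) < 1e-12
print("conditional distribution given T is free of p")
```

The cancellation of the factor p^t (1 - p)^(n - t) between numerator and denominator is exactly what the algebra in Example 6.2.1 exhibits.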
Definition 6.2.3 A vector valued statistic T ≡ (T_1, ..., T_k) where T_i ≡ T_i(X_1,
..., X_n), i = 1, ..., k, is called jointly sufficient (for the unknown parameter θ)
if and only if the conditional distribution of X = (X_1, ..., X_n) given T = t does
not involve θ, for all t ∈ 𝒯 ⊆ ℜ^k.
Section 6.2.1 shows how the conditional distribution of X given T =
t can be evaluated. Section 6.2.2 provides the celebrated Neyman factor-
ization which plays a fundamental role in locating sufficient statistics.
6.2.1 The Conditional Distribution Approach
With the help of examples, we show how Definition 6.2.2 can be applied
to find sufficient statistics for an unknown parameter θ.
Example 6.2.1 Suppose that X_1, ..., X_n are iid Bernoulli(p), where p is the
unknown parameter, 0 < p < 1. Here, χ = {0, 1}, θ = p, and Θ = (0, 1). Let us
consider the specific statistic T = X_1 + ... + X_n. Its values are denoted by t ∈ 𝒯 =
{0, 1, 2, ..., n}. We verify that T is sufficient for p by showing that the
conditional distribution of (X_1, ..., X_n) given T = t does not involve p, what-
ever be t ∈ 𝒯. From the Examples 4.2.2-4.2.3, recall that T has the Binomial(n,
p) distribution. Now, we obviously have:

P_p{X_1 = x_1, ..., X_n = x_n | T = t} = P_p{X_1 = x_1, ..., X_n = x_n, T = t} / P_p{T = t}.

But, when x_1 + ... + x_n = t, since {X_1 = x_1, ..., X_n = x_n} is a subset of B = {T = t},