6. Sufficiency, Completeness, and Ancillarity
For simplicity, however, let us assume first that θ is a single parameter and
denote the population pmf or pdf by f(x; θ) so that the dependence of the
features of the underlying distribution on the parameter θ becomes explicit. In
classical statistics, we assume that this parameter θ is fixed but otherwise
unknown, while the set Θ of all possible values of θ, called the parameter
space, satisfies Θ ⊆ ℜ, the real line. For example, in a problem, we may
postulate that the Xs are distributed as N(µ, σ²) where µ is the unknown
parameter, −∞ < µ < ∞, but σ²(> 0) is known. In this case we may denote
the pdf by f(x; θ) with θ = µ while the parameter space Θ = ℜ. But if both
the parameters µ and σ² are unknown, the population density would be
denoted by f(x; θ) where the parameter vector is θ = (µ, σ²) ∈ Θ = ℜ × ℜ⁺.
This is the idea behind indexing a population distribution by the unknown
parameters in general.
From the context, it should become clear whether the unknown
parameter is real valued (θ) or vector valued (θ, written in boldface).
Consider again the observable real valued iid random variables X₁, ..., Xₙ
from a population with the common pmf or pdf f(x; θ) where θ(∈ Θ) is the
unknown parameter. Our quest for gaining information about the unknown
parameter θ can safely be characterized as the core of statistical inference.
The data, of course, has all the information about θ even though we have not
yet specified how to quantify this information. We address this in Section
6.4. A dataset can be large or small, and it may be nice or cumbersome, but it
is ultimately incumbent upon the experimenter to summarize the data so that
all interesting features are captured by its summary. That is, the summary
should preferably have exactly the same information about the unknown
parameter θ as does the original data. If one can prepare such a summary,
then it would be as good as the whole data as far as the information content
regarding the unknown parameter θ is concerned. We would call such a
summary sufficient for θ and make this basic idea more formal as we move
along.
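As a numerical preview of this idea, the following sketch (a hypothetical illustration, not from the text) checks that for iid N(µ, 1) observations the log-likelihood depends on the sample only through its mean: two different datasets engineered to share the same sample mean have log-likelihoods that differ by a constant not involving µ, so they carry identical information about µ.

```python
import numpy as np

# Hypothetical illustration: for iid N(mu, 1) data, the likelihood depends
# on the sample only through the sample mean (a preview of sufficiency).
rng = np.random.default_rng(0)

def log_lik(mu, x):
    """Log-likelihood of iid N(mu, 1) observations x."""
    return -0.5 * len(x) * np.log(2 * np.pi) - 0.5 * np.sum((x - mu) ** 2)

# Two different datasets of the same size, forced to share one sample mean.
x = rng.normal(2.0, 1.0, size=5)
y = rng.normal(2.0, 1.0, size=5)
y = y - y.mean() + x.mean()          # now mean(y) == mean(x)

mus = np.linspace(-3.0, 3.0, 7)
diff = np.array([log_lik(m, x) - log_lik(m, y) for m in mus])

# The difference is the same constant at every mu: beyond their common
# mean, the two datasets tell us nothing different about mu.
print(np.allclose(diff, diff[0]))     # True
```

The algebra behind the check: the log-likelihood expands as a term free of µ plus µ·Σxᵢ − nµ²/2, so fixing Σxᵢ fixes every µ-dependent term.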
Definition 6.2.1 Any observable real or vector valued function T ≡
T(X₁, ..., Xₙ) of the random variables X₁, ..., Xₙ is called a statistic.
Some examples of statistics are X̄, Xₙ:ₙ, (X₁ + X₂), S² and so on.
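Each of the statistics just listed can be evaluated from the observed data alone. A minimal sketch, using made-up sample values for illustration:

```python
import numpy as np

# Hypothetical observed sample x_1, ..., x_n; every quantity below is
# computable without knowing any population parameter.
x = np.array([2.1, 1.4, 3.3, 2.8, 1.9])
n = len(x)

xbar  = x.mean()          # sample mean, X-bar
x_max = x.max()           # largest order statistic, X_{n:n}
pair  = x[0] + x[1]       # (X_1 + X_2)
s2    = x.var(ddof=1)     # sample variance S^2 (divisor n - 1)

print(xbar, x_max, pair, s2)
```

By contrast, a quantity such as (X̄ − µ)/σ is not a statistic when µ is unknown, because its value cannot be computed from the data alone.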
As long as the numerical evaluation of T, having observed a specific data
X₁ = x₁, ..., Xₙ = xₙ, does not depend on any unknown quantities, we will
call T a statistic. Supposing that X₁, ..., Xₙ are iid N(µ, σ²) where µ is
unknown, but σ is known, T = X̄ is a statistic because the value of T
associated with any observed data x₁, ..., xₙ can be explicitly calculated. In
the same example, however, the standardized form of X̄, namely