Page 312 - Probability and Statistical Inference

6. Sufficiency, Completeness, and Ancillarity  289

                           X_i = x_i, i = 1, ..., n, would stand for the corresponding likelihood function
                           L(θ). We will give several examples of L(θ) shortly.
                              One should note that once the data {x_i; i = 1, ..., n} has been observed,
                           there are no random quantities in (6.2.4), and so the likelihood L(.) is simply
                           treated as a function of the unknown parameter θ alone.
                                    The sample size n is assumed known and fixed before
                                                the data collection begins.
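
The point that L(.) is a function of θ alone, once the data are in hand, can be illustrated numerically. The sketch below is only an illustration under assumptions that are not part of the text: it takes the X_i to be i.i.d. Bernoulli(θ), and both the observed sample and the helper name bernoulli_likelihood are hypothetical.

```python
import math

def bernoulli_likelihood(theta, data):
    """Likelihood of i.i.d. Bernoulli(theta) data: product of theta^x * (1 - theta)^(1 - x)."""
    return math.prod(theta if x == 1 else 1.0 - theta for x in data)

# Hypothetical observed sample; the sample size n = 5 is fixed in advance.
observed = [1, 0, 1, 1, 0]

# With the data held fixed, nothing random remains: L is an ordinary
# function of theta, evaluated here at a few candidate values.
for theta in (0.2, 0.4, 0.6, 0.8):
    print(f"theta = {theta:.1f}   L(theta) = {bernoulli_likelihood(theta, observed):.5f}")
```

Changing the observed sample changes the function L(.), but for any one sample the only free argument is θ.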

                              One should note that θ can be real or vector valued in this general discussion.
                           However, let us pretend for the time being that θ is a real valued parameter.
                           Fisher (1922) discovered the fundamental idea of factorization. Neyman
                           (1935a) rediscovered a refined approach to factorize the likelihood function in
                           order to find sufficient statistics for θ. Halmos and Savage (1949) and Bahadur
                           (1954) gave more involved measure-theoretic treatments.
                              Theorem 6.2.1 (Neyman Factorization Theorem) Consider the likelihood
                           function L(θ) from (6.2.4). A real valued statistic T = T(X_1, ..., X_n) is
                           sufficient for the unknown parameter θ if and only if the following
                           factorization holds:

                               $$L(\theta) = g(T(x_1, \ldots, x_n); \theta)\, h(x_1, \ldots, x_n) \tag{6.2.5}$$

                           where the two functions g(.; θ) and h(.) are both nonnegative, h(x_1, ..., x_n) is
                           free from θ, and g(T(x_1, ..., x_n); θ) depends on x_1, ..., x_n only through the
                           observed value T(x_1, ..., x_n) of the statistic T.
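
To see what such a factorization looks like in a concrete case, one may check it by brute force for a small model. The sketch below is an illustration only and rests on assumptions not made in the text: the X_i are taken to be i.i.d. Bernoulli(θ) with n = 4, the statistic is T(x) = x_1 + ... + x_n, and the particular choices of g and h (as well as all function names) are hypothetical.

```python
import itertools
import math

def likelihood(theta, x):
    """Joint pmf of i.i.d. Bernoulli(theta) observations, viewed as L(theta)."""
    return math.prod(theta if xi == 1 else 1.0 - theta for xi in x)

def T(x):
    """Candidate sufficient statistic: the number of successes."""
    return sum(x)

def g(t, theta, n):
    """Depends on the data only through t = T(x); n is known and fixed."""
    return theta ** t * (1.0 - theta) ** (n - t)

def h(x):
    """Nonnegative and free from theta; here it is simply the constant 1."""
    return 1.0

n = 4
for theta in (0.1, 0.5, 0.9):
    # Enumerate every possible sample in {0, 1}^n and compare both sides.
    for x in itertools.product((0, 1), repeat=n):
        assert math.isclose(likelihood(theta, x), g(T(x), theta, n) * h(x))
print("L(theta) = g(T(x); theta) h(x) holds for every x and every theta tried")
```

The pair (g, h) is not unique: moving a θ-free factor from g into h, or back, gives another valid factorization with the same statistic T.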
                              Proof For simplicity, we will provide a proof only in the discrete case. Let
                           us write X = (X_1, ..., X_n) and x = (x_1, ..., x_n). Let the two sets A and B
                           respectively denote the events X = x and T(X) = T(x), and observe that A ⊆ B.
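                           Only if part: Suppose that T is sufficient for θ. Now, we write

                               $$L(\theta) = P_\theta\{A\} = P_\theta\{A \cap B\} = P_\theta\{B\}\, P_\theta\{A \mid B\} = P_\theta\{T(X) = T(x)\}\, P_\theta\{X = x \mid T(X) = T(x)\} \tag{6.2.6}$$
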
                           Comparing (6.2.5)-(6.2.6), let us denote g(T(x_1, ..., x_n); θ) = P_θ{T(X) = T(x)}
                           and h(x_1, ..., x_n) = P_θ{X = x | T(X) = T(x)}. But, we have assumed that T is
                           sufficient for θ and hence by Definition 6.2.2 of sufficiency, the conditional
                           probability P_θ{X = x | T(X) = T(x)} cannot depend on the parameter θ.
                           Thus, the function h(x_1, ..., x_n) so defined may depend only on x_1, ..., x_n.
                           The factorization given in (6.2.5) thus holds. The “only if” part is now
                           complete.
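
The construction used in the "only if" part can be traced numerically in a small discrete model. The sketch below is a hedged illustration, not part of the proof: it assumes i.i.d. Bernoulli(θ) observations with T(x) = x_1 + ... + x_n, computes P_θ{T(X) = T(x)} by direct enumeration, and then evaluates the conditional probability P_θ{X = x | T(X) = T(x)} at several values of θ. The sample x and all function names are hypothetical.

```python
import itertools
import math

def pmf(theta, x):
    """P_theta{X = x} for i.i.d. Bernoulli(theta) observations."""
    return math.prod(theta if xi == 1 else 1.0 - theta for xi in x)

def prob_T(theta, t, n):
    """P_theta{T(X) = t}, summing the joint pmf over all samples with sum equal to t."""
    return sum(pmf(theta, y)
               for y in itertools.product((0, 1), repeat=n)
               if sum(y) == t)

def conditional(theta, x):
    """P_theta{X = x | T(X) = T(x)}; this plays the role of h(x) in the proof."""
    return pmf(theta, x) / prob_T(theta, sum(x), len(x))

x = (1, 0, 1, 0)  # hypothetical observed sample with T(x) = 2
values = [conditional(theta, x) for theta in (0.2, 0.5, 0.7)]

# Under sufficiency the conditional probability is the same for every theta;
# in this model it equals 1 / C(n, t), the reciprocal of a binomial coefficient.
assert all(math.isclose(v, 1.0 / math.comb(len(x), sum(x))) for v in values)
print(values)
```

Here prob_T corresponds to g(T(x); θ) and conditional to h(x), so their product recovers the joint pmf, in line with the comparison of (6.2.5) and (6.2.6).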