
7

Point Estimation


7.1 Introduction

We begin with iid observable random variables X₁, ..., Xₙ having a common pmf or pdf f(x), x ∈ χ, the domain space for x. Here, n is assumed known and is referred to as the (fixed) sample size. We like to think of a population distribution which in practice may perhaps be approximated by f(x). For example, in a population of two thousand juniors on a college campus, we may be interested in the average GPA (= X) and its distribution, which is denoted by f(x) with an appropriate domain space χ = (0, 4) for x. The population distribution of X may be characterized or indexed by some parameter θ, for example the median GPA of the population. The practical significance of θ is that once we fix a value of θ, the population distribution f(x) would then be completely specified. Thus, we denote the pmf or pdf by f(x; θ), instead of just f(x), so that its dependence on a few specific population features (that is, θ) is now made explicit. The idea of indexing a population distribution by the unknown parameter θ was also discussed in Chapter 6.
We suppose that the parameter θ is fixed but otherwise unknown and that the possible values of θ belong to a parameter space Θ ⊆ ℜᵏ. For example, we may be able to postulate that the X's are distributed as N(µ, σ²) where µ is the only unknown parameter, −∞ < µ < ∞, 0 < σ < ∞. In this case we may denote the pdf of X by f(x; µ) where θ = µ ∈ Θ = ℜ, χ = ℜ. If µ and σ² are both unknown, then the population density would be denoted by f(x; θ) where θ = (µ, σ²) ∈ Θ = ℜ × ℜ⁺, χ = ℜ.
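For concreteness, in this two-parameter normal example the indexed pdf is simply the familiar normal density,
\[
f(x; \theta) = \frac{1}{\sigma\sqrt{2\pi}}
\exp\!\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\},
\quad x \in \Re, \; \theta = (\mu, \sigma^2) \in \Theta = \Re \times \Re^{+},
\]
so that fixing a value of θ = (µ, σ²) completely specifies the population distribution.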
Definition 7.2.1 gives a formal statement of what an estimator of a parameter is. In this chapter, we consider only point estimation problems. In Section 7.2, we first apply the method of moments, originated by Karl Pearson (1902), to find estimators of θ. This approach is ad hoc in nature, and hence we later introduce a more elaborate way of finding estimators by the method of maximum likelihood. The latter approach was pioneered by Fisher (1922, 1925a, 1934).
One may arrive at different choices of estimators for the unknown parameter θ, and hence some criteria to compare their performances are addressed in Section 7.3. One criterion which stands out more than anything