Page 500 - Probability and Statistical Inference

10



                           Bayesian Methods



                           10.1 Introduction

                           The material in this chapter is conceptually very different from anything
                           we included in the previous chapters. Thus far, we have developed
                           methodologies from the point of view of a frequentist. We started with a
                           random sample X₁, ..., Xₙ from a population having the pmf or pdf f(x; ϑ)
                           where x ∈ χ and ϑ ∈ Θ. The unknown parameter ϑ is assumed fixed. A
                           frequentist’s inference procedures depended on the likelihood function
                           denoted earlier by L(ϑ) = f(x₁; ϑ) × ... × f(xₙ; ϑ), where ϑ is unknown
                           but fixed.
                              In the Bayesian approach, the experimenter believes from the very
                           beginning that the unknown parameter ϑ is a random variable having its own
                           probability distribution on the space Θ. Now that ϑ is assumed random, the
                           likelihood function will be the same as L(θ) given that ϑ = θ. Let us denote
                           the pmf or pdf of ϑ at the point ϑ = θ by h(θ), which is called the prior
                           distribution of ϑ.

                                In previous chapters, we wrote f(x; θ) for the pmf or pdf of X.
                               Now, f(x; θ) denotes the conditional pmf or pdf of X given ϑ = θ.
                                The unknown parameter ϑ is assumed a random variable with its
                                 own distribution h(θ) when ϑ = θ ∈ Θ, the parameter space.
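
The two-stage reading of this notation can be sketched in Python. This is only an illustration under assumptions not made in the text: the prior h(θ) is taken to be Uniform(0, 1), the conditional model f(x; θ) is taken to be Bernoulli(θ), and the function names are hypothetical.

```python
import random

def draw_theta(rng):
    # Assumed prior h(theta): Uniform(0, 1) on the parameter space (illustrative choice).
    return rng.random()

def draw_sample(theta, n, rng):
    # Assumed conditional model f(x; theta): Bernoulli(theta), given the parameter equals theta.
    return [1 if rng.random() < theta else 0 for _ in range(n)]

def simulate(n, seed=0):
    # Stage 1: the parameter is drawn from its own distribution.
    # Stage 2: the data are drawn conditionally on the realized value theta.
    rng = random.Random(seed)
    theta = draw_theta(rng)
    xs = draw_sample(theta, n, rng)
    return theta, xs

theta, xs = simulate(n=10, seed=1)
```

In practice only `xs` would be observed; the realized value `theta` stays hidden, which is why inference about the parameter has to go through both h(θ) and f(x; θ).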

                              The prior distribution h(θ) often reflects an experimenter’s subjective
                           belief regarding which ϑ values are more (or less) likely when one considers the
                           whole parameter space Θ. The prior distribution is ideally fixed before the
                           data gathering begins. An experimenter may utilize related expertise and other
                           knowledge in order to come up with a realistic prior distribution h(θ).
                              The Bayesian paradigm requires one to proceed as follows: perform all
                           statistical inference and analysis after combining the information about ϑ
                           contained in the collected data, as evidenced by the likelihood function
                           L(θ) given that ϑ = θ, with the information from the prior distribution h(θ). One
                           combines the evidence about ϑ derived from both the prior distribution and
                           the likelihood function by means of Bayes’s Theorem (Theorem 1.4.3).
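
As a rough numerical sketch of this combination, consider a parameter restricted to a finite grid of candidate values, so that Bayes’s Theorem reduces to multiplying each prior weight by the likelihood and renormalizing. The grid, the equal prior weights, the Bernoulli model, and the data below are all hypothetical choices made only for illustration; the renormalized weights are what is commonly called the posterior.

```python
def likelihood(theta, xs):
    # L(theta) = product of f(x_i; theta), assuming Bernoulli(theta) observations.
    value = 1.0
    for x in xs:
        value *= theta if x == 1 else (1.0 - theta)
    return value

def combine(grid, prior, xs):
    # Bayes's Theorem on a finite grid: weight proportional to prior * likelihood.
    products = [h * likelihood(t, xs) for t, h in zip(grid, prior)]
    total = sum(products)  # normalizing constant
    return [p / total for p in products]

grid = [0.1, 0.3, 0.5, 0.7, 0.9]   # hypothetical candidate values of theta
prior = [0.2] * 5                  # hypothetical prior h(theta): equal weights
xs = [1, 1, 0, 1]                  # hypothetical observed data
updated = combine(grid, prior, xs)
```

With equal prior weights the updated weights simply follow the likelihood; an unequal `prior` would shift them toward the values the experimenter considered more plausible before seeing the data.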
                           After combining the information from these two sources, we come up