Page 393 - Numerical Methods for Chemical Engineering

382     8 Bayesian statistics and parameter estimation



                   approach – is based upon the relative occurrences of events in many repetitive trials. Let
                   us say that the probability of observing an event E in an independent random trial is p(E).
                   The frequentist way of defining the value of the probability of observing E, 0 ≤ p(E) ≤ 1,
                   is to say that if we perform a large number T of such trials, with the observed number of
                   occurrences of E being N_E, then p(E) ≈ N_E/T.
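                     This limiting-frequency definition is easy to demonstrate by simulation. The sketch below uses a fair six-sided die as a hypothetical event E (the function name, trial count, and seed are our own choices) and estimates p(E) = 1/6 as N_E/T:

```python
import random

def empirical_probability(event, trials=100_000, seed=0):
    """Estimate p(E) as N_E / T over T independent random trials.
    (Illustrative helper; not from the text.)"""
    rng = random.Random(seed)
    n_e = sum(event(rng) for _ in range(trials))  # N_E: occurrences of E
    return n_e / trials

# Hypothetical event E: a fair die shows a six, so p(E) = 1/6.
p_hat = empirical_probability(lambda rng: rng.randint(1, 6) == 6)
```

                   As T grows, the ratio N_E/T fluctuates less and less about 1/6, in line with the frequentist definition.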
                     We can also define probabilities as statements of belief (de Finetti, 1970). I say that the
                   probability of observing E during a random trial is p(E) if I have no reason to prefer one of
                   the following two bets over the other:

                                       event E is observed in a particular trial;
                                                      or
                    a perfectly uniform random number generator in [0, 1] returns a value u
                                         that is less than p(E).
                   It is then necessary for me, as the holder of this belief system, to ensure that the probability
                   values that I assign satisfy the appropriate conditions, e.g. are nonnegative, sum or integrate
                   to 1, follow all laws of conditional and joint probabilities, etc.
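                     The uniform-draw bet above doubles as a practical sampling recipe: declare that E occurs exactly when u < p(E). A minimal sketch, with p = 0.3 as an arbitrary illustrative choice:

```python
import random

def bernoulli_event(p, rng):
    """The second bet of the pair: E 'occurs' exactly when a
    uniform draw u in [0, 1] is less than p(E)."""
    return rng.random() < p

# If the two bets are equally attractive, this event must occur with
# long-run frequency p; check with p = 0.3 (an arbitrary value).
rng = random.Random(42)
frequency = sum(bernoulli_event(0.3, rng) for _ in range(50_000)) / 50_000
```

                   The observed frequency approaches 0.3, so holding this belief system does not expose the bettor to a systematic loss on either side of the bet.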
                     Bayesian statistics is based upon manipulation of the probability p(θ|y) that the model
                   has a parameter vector θ, given a set of measured response data y. While the Bayesian
                   approach dates to the work of Thomas Bayes in the mid-1700s, it was slow to gain acceptance
                   because, by treating θ as a random vector, it violates the philosophical principle of Laplacian
                   determinism, which holds that nature is deterministic and predictable. Such criticisms were muted
                   by the interpretation of probabilities as statements of belief, leading to a resurgence of the
                   Bayesian approach.
                     But by resorting to the use of a belief system to define p(θ|y), we introduce the issue of
                   subjectivity, as it is possible that two different analysts will hold different belief systems,
                   and thus arrive at different conclusions from the same data. The complaints of the
                   frequentist school about the subjectivity of the Bayesian approach persist to this day, but can be
                   countered by showing that implementations of the Bayesian paradigm exist such that two
                   analysts, given the same data and working independently, reach (nearly) the same
                   conclusions. These implementations are not quite objective, but they are highly reproducible from
                   one analyst to another.
                     The development of tools to ensure the use of such “objective” belief systems,
                   together with computational methods such as Monte Carlo simulation, has brought the Bayesian
                   approach to the fore in recent years in areas such as parameter estimation, statistical learning, and
                   statistical decision theory. Here, we focus our attention primarily upon parameter estimation,
                   first restricting our discussion to single-response data.


                   Bayes’ theorem
                   Bayesian analysis is based upon Bayes’ theorem, itself a direct consequence of the axioms
                   of probability theory. It is not the theorem that is controversial; it is its application to statistics. Thus, it
                   is best first to understand the theorem, before considering how it is applied to statistical
                   inference.
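                     Before the derivation, a quick numerical check of the theorem may help. For two events E_1 and E_2, Bayes’ theorem reads P(E_1|E_2) = P(E_2|E_1)P(E_1)/P(E_2); the probabilities below are hypothetical values of our own choosing, in the style of a rare-event diagnostic:

```python
# Hypothetical prior and conditional probabilities (illustrative only).
p_e1 = 0.01                 # P(E1): the prior
p_e2_given_e1 = 0.95        # P(E2 | E1)
p_e2_given_not_e1 = 0.05    # P(E2 | not E1)

# Total probability: P(E2) = P(E2|E1) P(E1) + P(E2|~E1) P(~E1).
p_e2 = p_e2_given_e1 * p_e1 + p_e2_given_not_e1 * (1 - p_e1)

# Bayes' theorem: P(E1|E2) = P(E2|E1) P(E1) / P(E2).
p_e1_given_e2 = p_e2_given_e1 * p_e1 / p_e2
```

                   Even though P(E_2|E_1) = 0.95 is high, the posterior P(E_1|E_2) comes out at only about 0.16, because the small prior P(E_1) = 0.01 dominates; this interplay of prior and likelihood is the heart of Bayesian inference.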
                     Let us consider two random events E_1 and E_2 that are not mutually exclusive (i.e. none,
                   one, or both may occur). Let the probability that E_1 occurs be P(E_1) and the probability