Page 383 - Numerical Methods for Chemical Engineering

8 Bayesian statistics and parameter estimation













                   Throughout this text, we have considered algorithms to perform simulations – given a model
                   of a system, what is its behavior? We now consider the question of model development.
                   Typically, to develop a model, we postulate a mathematical form, hopefully guided by
                   physical insight, and then perform a number of experiments to determine the choice of
                   parameters that best matches the model behavior to that observed in the set of experiments.
                   This procedure of model proposition and comparison to experiment generally must be
                   repeated iteratively until the model is deemed to be sufficiently reliable for the purpose at
                   hand. The problem of drawing conclusions from data is known as statistical inference, and
                   in particular, our focus here is upon parameter estimation. We use the powerful Bayesian
                   framework for statistics, which provides a coherent approach to statistical inference and a
                   procedure for making optimal decisions in the presence of uncertainty. We build upon the
                   concepts of the last chapter and find, in particular, Monte Carlo simulation to be a powerful
                   and general tool for Bayesian statistics.


                   General problem formulation


                   The basic parameter estimation, or regression, problem involves fitting the parameters of a
                   proposed model to agree with the observed behavior of a system (Figure 8.1). We assume
                   that, in any particular measurement of the system behavior, there is some set of predictor
                   variables x ∈ ℝ^M that fully determines the behavior of the system (in the absence of any
                   random noise or error). For each experiment, we measure some set of response variables
                   y^(r) ∈ ℝ^L. If L = 1, we have single-response data and if L > 1, multiresponse data. We write
                   these predictor and response vectors in row form,

                           x = [x_1  x_2  ...  x_M]      y^(r) = [y_1  y_2  ...  y_L]                    (8.1)
                   We propose a mathematical relation, which maps the predictors x to the responses y^(r), that
                   involves a set of adjustable model parameters θ ∈ ℝ^P, whose values we wish to estimate
                   from the measured response data. Let us say that we have a set of N experiments, in which
                   for experiment k = 1, 2, ..., N, x^[k] is the row vector of predictor variables and the row
                   vector of measured response data is y^[k]. For each experiment, we have a model prediction
                   of the response

                           ŷ^[k](θ) = f(x^[k]; θ)                                                        (8.2)
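                   As a concrete illustration of this setup, the following sketch stacks the predictor rows x^[k]
                   into an N × M array and evaluates the model prediction ŷ^[k](θ) = f(x^[k]; θ) for each
                   experiment. The particular model form, parameter values, and data here are hypothetical,
                   chosen only to show the shapes involved (N = 3, M = 2, L = 1, P = 2); the text does not
                   specify a model at this point.

```python
import numpy as np

# Hypothetical single-response model (L = 1) with M = 2 predictors and
# P = 2 parameters: y_hat = theta_1 * x_1 + theta_2 * x_2^2.
def f(x, theta):
    """Model prediction y_hat = f(x; theta) for one experiment, cf. (8.2)."""
    return np.array([theta[0] * x[0] + theta[1] * x[1] ** 2])

# Predictor data for N = 3 experiments; row k is the row vector x^[k] of (8.1).
X = np.array([[1.0, 0.5],
              [2.0, 1.0],
              [3.0, 1.5]])

# A trial value of the adjustable parameter vector theta.
theta = np.array([2.0, -1.0])

# Stack the model predictions y_hat^[k](theta) for all N experiments
# into an N x L array, mirroring the layout of the measured responses y^[k].
Y_hat = np.array([f(X[k], theta) for k in range(X.shape[0])])
print(Y_hat.shape)  # (3, 1), i.e. (N, L)
```

                   Parameter estimation then amounts to choosing θ so that the rows of this prediction array
                   agree, as closely as the noise allows, with the measured responses y^[k].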