Page 425 - Numerical Methods for Chemical Engineering

414     8 Bayesian statistics and parameter estimation



                   linearized design matrix. In cases where it may be very costly to come back later to perform
                   additional experiments, we may wish to try multiple estimates of θ, repeat the eigenvalue
                   analysis for each linearized design matrix, and accept only a design that appears to provide
                   sufficient accuracy for all plausible values of θ.
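                   The screening over several plausible values of θ can be sketched as follows. This is only an
                   illustration under stated assumptions: the two-parameter model y = θ1 (1 − exp(−θ2 x)), the
                   function names, and the acceptance tolerance `tol` are all hypothetical and do not come from
                   the text. The sketch uses the fact that, under the linearization, cov(θ) ≈ σ² (XᵀX)⁻¹, so the
                   largest standard deviation along any principal direction is σ divided by the square root of
                   the smallest eigenvalue of XᵀX.

```python
import math

def jacobian_row(x, theta):
    """Row of the linearized design matrix for the hypothetical model
    y = theta1 * (1 - exp(-theta2 * x)): partial derivatives w.r.t. theta1, theta2."""
    t1, t2 = theta
    e = math.exp(-t2 * x)
    return (1.0 - e, t1 * x * e)

def xtx(design, theta):
    """Entries (a, b, d) of the symmetric 2x2 matrix X^T X = [[a, b], [b, d]]."""
    a = b = d = 0.0
    for x in design:
        j1, j2 = jacobian_row(x, theta)
        a += j1 * j1
        b += j1 * j2
        d += j2 * j2
    return a, b, d

def min_eigenvalue(a, b, d):
    """Smallest eigenvalue of [[a, b], [b, d]] in closed form."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius

def worst_std(design, theta, sigma):
    """Largest parameter standard deviation along a principal direction,
    sigma / sqrt(lambda_min); infinite if X^T X is singular."""
    lam = min_eigenvalue(*xtx(design, theta))
    return math.inf if lam <= 0.0 else sigma / math.sqrt(lam)

def design_is_acceptable(design, thetas, sigma, tol):
    """Accept the design only if it is accurate enough for every candidate theta.
    The criterion worst_std <= tol is an assumed stand-in for the accuracy requirement."""
    return all(worst_std(design, th, sigma) <= tol for th in thetas)

# Hypothetical candidate design and plausible parameter values.
design = [0.5, 1.0, 2.0, 4.0]
candidates = [(1.0, 0.5), (1.0, 1.0), (1.0, 2.0)]
ok = design_is_acceptable(design, candidates, sigma=0.1, tol=1.0)
```

                   A design that fails for even one candidate θ is rejected, which mirrors the conservative
                   stance taken when additional experiments would be costly to perform later.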


                   Example. Determining the number of additional experiments
                   necessary for the protein expression data

                   We consider once again the data for the protein expression levels of wild-type and mutant
                   bacterial strains (8.35), with $X^T X$ and its inverse again given by (8.40). For a specified
                   $\sigma$, the standard deviation of $\theta_2$ is

                   $$\mathrm{std}(\theta_2) = \sigma \sqrt{\left[(X^T X)^{-1}\right]_{22}} = \sigma \sqrt{2 n^{-1}} \tag{8.176}$$
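                   The expected width of the confidence interval in this parameter is then

                   $$\left|\theta_2 - \theta_{M,2}\right| \approx Z_{\alpha/2}\, \sigma \sqrt{2 n^{-1}} \tag{8.177}$$

                   Or, to account roughly for the extra uncertainty in $\sigma$, we could use

                   $$\left|\theta_2 - \theta_{M,2}\right| \approx T_{n-2,\alpha/2}\, s \sqrt{2 n^{-1}} \tag{8.178}$$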

                   We can use (8.178) with $n = 4 + m$ and the $s$-value from the existing data to estimate the
                   number $m$ of additional experiments necessary to reduce the uncertainty in $\theta_2$ to a
                   desired level.
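                   The search for m can be sketched in a few lines. This is a minimal illustration, not the
                   author's procedure: the function names, the 95% default, and the simple "increase m until the
                   target is met" loop are assumptions, and the s-value is treated as unchanged by the new data.
                   The Student t quantile is obtained here by Simpson integration and bisection so that the
                   sketch needs only the standard library; a statistics package would normally supply it.

```python
import math

def t_pdf(x, nu):
    """Probability density of the Student t distribution with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))

def t_cdf(x, nu, panels=400):
    """CDF via composite Simpson integration on [0, |x|] plus symmetry (panels must be even)."""
    if x == 0.0:
        return 0.5
    a = abs(x)
    h = a / panels
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + math.copysign(total * h / 3, x)

def t_quantile(p, nu):
    """Value q with P(T <= q) = p for p > 0.5, found by bracketing and bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, nu) < p:
        hi *= 2.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if t_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def half_width(n, s, alpha=0.05):
    """Right-hand side of (8.178): T_{n-2, alpha/2} * s * sqrt(2 / n)."""
    return t_quantile(1.0 - alpha / 2, n - 2) * s * math.sqrt(2.0 / n)

def additional_experiments(s, target, n_existing=4, alpha=0.05, m_max=10000):
    """Smallest m with half_width(n_existing + m) <= target, or None if m_max is exceeded."""
    for m in range(m_max + 1):
        if half_width(n_existing + m, s, alpha) <= target:
            return m
    return None

# Hypothetical inputs: s from the existing data and a desired half-width.
m_needed = additional_experiments(s=1.0, target=1.0)
```

                   Because both the t quantile and the factor sqrt(2/n) shrink as n grows, the half-width
                   decreases monotonically with m, so the first m that meets the target is the smallest one.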
                     Here, our emphasis has been upon experimental design; however, eigenvalue analysis
                   and SVD of the design matrix can also be used to extract at least partial results when
                   $X^T X$ is singular. This subject is discussed in further detail in the supplemental material
                   on the accompanying website.


                   Bayesian multiresponse regression


                   Previously, we have considered only the analysis of single-response data. Here, we discuss
                   multiresponse regression, focusing primarily upon the extension of the least-squares method
                   to the case of multiple, perhaps correlated, responses in each experiment.
                     Again, we perform a number N of experiments, where in the kth experiment we
                   have a known set of M predictor variables, $x^{[k]} \in \mathbb{R}^M$, and we observe the L
                   responses $y^{[k]} \in \mathbb{R}^L$. We wish to estimate the values of P unknown parameters
                   $\theta \in \mathbb{R}^P$ in a model whose predicted responses for each experiment form a
                   vector $f(x^{[k]}; \theta) \in \mathbb{R}^L$. We assume that the measured responses are equal
                   to the model predictions plus a random error vector,

                   $$y^{[k]} = f\left(x^{[k]}; \theta\right) + \varepsilon^{[k]} \tag{8.179}$$

                   $\varepsilon^{[k]} \in \mathbb{R}^L$ is assumed to be independent of the other
                   $\varepsilon^{[l \neq k]}$, but we allow that the components of $\varepsilon^{[k]}$ may be
                   correlated. The L × L covariance matrix (unknown) of each error vector is