The least-squares method reconsidered
Our proposed prior (8.68) makes use of the assumption of prior independence,
p(θ,σ) = p(θ)p(σ) (8.72)
which states that, prior to the experiment, the belief systems about θ and σ are independent.
The posterior density is then
p(θ,σ|y) ∝ l(θ,σ|y)p(θ)p(σ) (8.73)
Assuming the Gauss–Markov conditions hold and that the errors are normally distributed,
the likelihood function is
l(\theta, \sigma \mid y) = \left(\frac{1}{\sqrt{2\pi}}\right)^{\!N} \sigma^{-N} \exp\!\left[-\frac{1}{2\sigma^{2}}\, S(\theta)\right] \qquad (8.74)
and thus the posterior is
p(\theta, \sigma \mid y) \propto \left(\frac{1}{\sqrt{2\pi}}\right)^{\!N} \sigma^{-N} \exp\!\left[-\frac{1}{2\sigma^{2}}\, S(\theta)\right] p(\theta)\, p(\sigma) \qquad (8.75)
If, as we have assumed in (8.68), the prior is uniform in the region of appreciable nonzero
likelihood, p(θ) ∼ c, then the most probable value of θ, for any value of σ, is that which
minimizes S(θ), (8.58). Therefore, the least-squares method is justified statistically, as long
as the Gauss–Markov conditions hold and the errors are normally distributed.
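Taking the logarithm of (8.75) makes this reduction explicit; a brief sketch, writing const for terms that depend on neither θ nor σ:

\ln p(\theta, \sigma \mid y) = \mathrm{const} - N \ln\sigma - \frac{1}{2\sigma^{2}}\, S(\theta) + \ln p(\theta) + \ln p(\sigma)

With p(θ) ≈ c wherever the likelihood is appreciable, the only θ-dependence at fixed σ enters through −S(θ)/(2σ²), so the posterior mode in θ coincides with the minimizer of S(θ).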
It is shown in the supplemental material on the accompanying website that, in the fre-
quentist view, least squares is an unbiased estimator of the true value (i.e., if we repeat
the set of experiments many times, the average estimate equals the true value) provided
only that the zero-mean Gauss–Markov condition (8.50) is satisfied.
Numerical treatment of nonlinear least-squares problems
For a linear model $y^{[k]} = x^{[k]} \cdot \theta + \varepsilon^{[k]}$, we obtain the least-squares estimate by solving an
algebraic system $[X^{\mathrm{T}} X]\, \theta_{\mathrm{LS}} = X^{\mathrm{T}} y$. For a nonlinear model

y^{[k]} = f\!\left(x^{[k]}; \theta\right) + \varepsilon^{[k]} \qquad (8.76)
we must find the least-squares estimate through numerical optimization. For notational
convenience, we define the cost function as
F_{\mathrm{cost}}(\theta) = \frac{1}{2}\, S(\theta) = \frac{1}{2} \sum_{k=1}^{N} \left[ y^{[k]} - f\!\left(x^{[k]}; \theta\right) \right]^{2} \qquad (8.77)
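To make (8.77) concrete, the following minimal Python sketch evaluates the cost for a hypothetical two-parameter exponential-decay model; the names f_model, F_cost, x_data, and y_data and the choice of model are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Hypothetical model f(x; theta) = theta_1 * exp(-theta_2 * x); the text leaves
# f(x^[k]; theta) general, so this particular form is for illustration only.
def f_model(x, theta):
    return theta[0] * np.exp(-theta[1] * x)

# Cost function (8.77): F_cost(theta) = (1/2) * sum_k ( y^[k] - f(x^[k]; theta) )^2
def F_cost(theta, x_data, y_data):
    residuals = y_data - f_model(x_data, theta)
    return 0.5 * (residuals @ residuals)
```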
The gradient $\gamma = \nabla F_{\mathrm{cost}}$ of this cost function has components

\gamma_{a}(\theta) = \frac{\partial F_{\mathrm{cost}}}{\partial \theta_{a}} = -\sum_{k=1}^{N} \left[ y^{[k]} - f\!\left(x^{[k]}; \theta\right) \right] \left.\frac{\partial f}{\partial \theta_{a}}\right|_{x^{[k]};\, \theta} \qquad (8.78)
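Continuing the sketch above, the gradient (8.78) is assembled from the model sensitivities ∂f/∂θ_a; the finite-difference check and the synthetic data below are illustrative additions using the same hypothetical decay model as before.

```python
# Gradient (8.78): gamma_a = -sum_k ( y^[k] - f(x^[k]; theta) ) * (df/dtheta_a)|_(x^[k]; theta)
def grad_F_cost(theta, x_data, y_data):
    residuals = y_data - f_model(x_data, theta)        # y^[k] - f(x^[k]; theta)
    # Sensitivities of the illustrative model f = theta_1 * exp(-theta_2 * x)
    df_dtheta1 = np.exp(-theta[1] * x_data)
    df_dtheta2 = -theta[0] * x_data * np.exp(-theta[1] * x_data)
    J = np.column_stack((df_dtheta1, df_dtheta2))      # N x 2 matrix of df/dtheta_a
    return -(J.T @ residuals)

# Quick consistency check of (8.78) against central finite differences.
def fd_gradient(theta, x_data, y_data, h=1e-6):
    g = np.zeros_like(theta)
    for a in range(theta.size):
        e = np.zeros_like(theta)
        e[a] = h
        g[a] = (F_cost(theta + e, x_data, y_data)
                - F_cost(theta - e, x_data, y_data)) / (2 * h)
    return g

# Illustrative usage with noise-free synthetic data.
x_data = np.linspace(0.0, 5.0, 20)
y_data = f_model(x_data, np.array([2.0, 0.7]))
theta_test = np.array([1.5, 0.5])
print(grad_F_cost(theta_test, x_data, y_data))
print(fd_gradient(theta_test, x_data, y_data))         # should agree closely
```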
We find a local minimum by applying either the nonlinear conjugate gradient method or,
as below, a variation of Newton’s method. For the latter technique, we use the Hessian