Page 307 - Statistics for Environmental Engineers


[Figure 35.3. Left panel: growth rate (1/hr) versus substrate concentration (mg/L), with fitted estimates θ̂₁ = 0.143 and θ̂₂ = 49.1. Right panel: approximate 95% joint confidence region for θ₁ and θ₂.]
FIGURE 35.3 Monod model fitted to four of the original five points and one additional point at a higher substrate concentration
(left) and the resulting joint confidence region of the parameters (right).

The Problem of Parameter Correlation

Parameter correlation means that the estimate of one parameter is related to the estimate of another.
Parameter correlation is what causes elongated joint confidence regions. Here we look at the importance
of experimental design in reducing parameter correlation.
The location of observations is crucial, and making a large number of observations at the wrong
locations does not overcome the weakness of a bad experimental design. A great many articles on the
effect of temperature, pH, metal concentration, etc. on reactions contain parameters estimated from weak
designs (Berthouex and Szewczyk, 1984). The efficiency of aeration equipment was sometimes estimated
using experimental designs that could not yield precise parameter estimates (Boyle et al., 1974).
Asymptotic functions, which are common in environmental modeling, present a particular problem
in parameter estimation that we will illustrate with the classical first-order model for long-term BOD:

$$ y_i = \theta_1 \left[ 1 - \exp(-\theta_2 t_i) \right] + e_i $$

where yᵢ is the BOD measured at time tᵢ. The ultimate BOD (θ₁) is approached asymptotically as time
goes on. θ₂ is the first-order reaction rate coefficient. The reaction is slow and t is measured in days.
Each observation of y comes from incubating a test specimen for time t and, as a result, the y values
and their errors (e) are independent. We further assume that the errors are normally distributed and have
constant variance.
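
The model and its error assumptions can be sketched in a few lines of Python. This is only an illustration: the parameter values (θ₁ = 250 mg/L, θ₂ = 0.1 per day), the error standard deviation (5 mg/L), and the function names are hypothetical and are not taken from the text.

```python
import math
import random


def bod_model(theta1, theta2, t):
    """First-order BOD curve: theta1 * (1 - exp(-theta2 * t))."""
    return theta1 * (1.0 - math.exp(-theta2 * t))


def simulate_bod(theta1, theta2, times, sigma, seed=0):
    """Return one simulated observation per incubation time.

    Each observation gets its own independent, normally distributed error
    with constant standard deviation sigma, matching the stated assumptions.
    """
    rng = random.Random(seed)
    return [bod_model(theta1, theta2, t) + rng.gauss(0.0, sigma) for t in times]


# Hypothetical values chosen only for illustration.
THETA1 = 250.0   # assumed ultimate BOD, mg/L
THETA2 = 0.1     # assumed rate coefficient, 1/day
SIGMA = 5.0      # assumed error standard deviation, mg/L

times = [1, 2, 3, 4, 5]  # incubation times in days
observations = simulate_bod(THETA1, THETA2, times, SIGMA)

for t, y in zip(times, observations):
    print(f"day {t}: simulated BOD = {y:.1f} mg/L")
```

Because each simulated specimen draws its own error, the observations are independent, unlike repeated readings taken on a single bottle over time.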
Many published papers show estimates of θ₁ and θ₂ derived from measurements at just a few early
times, that is, at days 1 through 5. The experimenters may have reasoned something like this: “I am
interested mainly in an estimate of the rate constant, and I want to get the data in the shortest time
possible. If θ₂ is known, I can compute θ₁. Because the reaction rate is high on days 1 through 5, data
over this range will give a good estimate of θ₂ and the experiment can be finished within a few days.”
This plausible argument is wrong. The experiment gives poor estimates of both θ₁ and θ₂. The rate
constant can only be precisely estimated if the asymptotic level is well estimated, and this requires
having measurements at 15 to 20 days no matter how many measurements are made at early times. The
problem is that data at early times only allow estimation of the initial slope of the curve, which is:

$$ \left. \frac{dy}{dt} \right|_{t=0} = \left. \theta_2 \theta_1 \exp(-\theta_2 t) \right|_{t=0} = \theta_2 \theta_1 = \text{constant} $$

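A series expansion, offered here as a supplementary step rather than as part of the original derivation, shows how the product dominates at early times. Using 1 − exp(−x) = x − x²/2 + …, the model for small θ₂t is approximately:

$$ \theta_1 \left[ 1 - \exp(-\theta_2 t) \right] \approx \theta_1 \theta_2 \, t - \tfrac{1}{2} \theta_1 \theta_2^2 \, t^2 $$

The leading term depends only on the product θ₁θ₂. The curvature term is what separates the two parameters, and it is small relative to the leading term while θ₂t is small.
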
Thus, any values of θ₁ and θ₂ that have a product nearly equal to the slope of the curve over the first
few days will reasonably fit the data collected on the first few days. The shape of the joint confidence
region will be similar to Figure 34.1d. The hyperbolic shape reflects the parameter correlation and shows
that neither parameter is well estimated. Well-designed experiments will yield precise, uncorrelated
parameter estimates and the joint confidence region will tend toward being elliptical.
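
The effect of the design can be checked numerically with a minimal sketch. It compares two parameter pairs that share the same product θ₁θ₂ and measures how far apart their curves are over days 1 to 5 and over days 1 to 20. Both parameter pairs and all function names are hypothetical, chosen only for illustration.

```python
import math


def bod_model(theta1, theta2, t):
    """First-order BOD curve: theta1 * (1 - exp(-theta2 * t))."""
    return theta1 * (1.0 - math.exp(-theta2 * t))


def rms_difference(params_a, params_b, times):
    """Root-mean-square gap between two model curves at the given times."""
    squares = [
        (bod_model(*params_a, t) - bod_model(*params_b, t)) ** 2
        for t in times
    ]
    return math.sqrt(sum(squares) / len(squares))


# Two hypothetical (theta1, theta2) pairs with the same product, 25 mg/L per day.
reference = (250.0, 0.1)
alternative = (300.0, 25.0 / 300.0)

early_days = range(1, 6)    # days 1 through 5
full_days = range(1, 21)    # days 1 through 20

rms_early = rms_difference(reference, alternative, early_days)
rms_full = rms_difference(reference, alternative, full_days)

print(f"RMS gap over days 1-5:  {rms_early:.2f} mg/L")
print(f"RMS gap over days 1-20: {rms_full:.2f} mg/L")
```

Under these assumed values the two curves stay within a few mg/L of each other over the first five days, a gap that ordinary measurement error could easily hide, while the gap at the later times is several times larger. Only the later observations discriminate between the two parameter pairs.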
