Page 179 - Statistics for Environmental Engineers


Example 21.2

A random sample of n = 5 observations yields the values $\bar{y}$ = 28.4 µg/L and s = 1.18 µg/L. An
additional m = 10 specimens are to be taken at random from the same population.
1. Construct a two-sided (simultaneous) 95% prediction interval to contain the concentrations
of all 10 additional specimens. For n = 5, m = 10, and α = 0.05, the factor is 5.23 from the
second row of Table 21.2. The prediction interval is:

28.4 ± 5.23(1.18) = [22.2, 34.6]

We are 95% confident that the concentrations of all 10 specimens will be contained within the
interval 22.2 to 34.6 µg/L.
2. Construct a two-sided 95% prediction interval to contain the mean of the concentration readings
of five additional specimens randomly selected from the same population. For n = 5, m = 5,
and 1 − α = 0.95, the factor is 1.76 and the interval is:

28.4 ± 1.76(1.18) = [26.3, 30.5]

We are 95% confident that the mean of the concentration readings of five additional specimens will be
in the interval 26.3 to 30.5 µg/L.
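The arithmetic in Example 21.2 can be checked with a short script. This is a minimal sketch, not the book's procedure: the helper name `prediction_interval` is hypothetical, the factor 5.23 is simply taken from Table 21.2 as quoted above, and the factor for the mean of m future observations is recomputed under the assumption that it follows the usual normal-theory form t(n − 1, 1 − α/2) · sqrt(1/m + 1/n), with the t quantile 2.776 taken from a standard table.

```python
import math

def prediction_interval(ybar, s, factor):
    """Return (lower, upper) for the interval ybar ± factor * s."""
    half_width = factor * s
    return ybar - half_width, ybar + half_width

# Sample summary from Example 21.2
ybar, s, n = 28.4, 1.18, 5

# Part 1: the factor 5.23 is read from Table 21.2 (n = 5, m = 10, alpha = 0.05)
lo_all, hi_all = prediction_interval(ybar, s, 5.23)

# Part 2 (assumption): factor for the mean of m future observations is
#   t(n-1, 1-alpha/2) * sqrt(1/m + 1/n)
t_4_0975 = 2.776  # Student t quantile, 4 degrees of freedom, from a standard table
m = 5
factor_mean = t_4_0975 * math.sqrt(1 / m + 1 / n)
lo_mean, hi_mean = prediction_interval(ybar, s, factor_mean)

print(f"all 10 specimens:     [{lo_all:.1f}, {hi_all:.1f}] ug/L")
print(f"factor for mean of 5: {factor_mean:.2f}")
print(f"mean of 5 specimens:  [{lo_mean:.1f}, {hi_mean:.1f}] ug/L")
```

Under that assumption the recomputed factor rounds to the 1.76 quoted in the example, and both intervals round to the values given in the text.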

There are two sources of imprecision in statistical prediction. First, because the given data are limited,
there is uncertainty with respect to the parameters of the previously sampled population. Second, there
is random variation in the future sample. Say, for example, that the results of an initial sample of size
n from a normal population with unknown mean η and unknown standard deviation σ are used to predict
the value of a single future randomly selected observation from the same population. The mean $\bar{y}$ of the
initial sample is used to predict the future observation. Now $\bar{y} = \eta + e$, where $e$, the random variation
associated with the mean of the initial sample, is normally distributed with mean 0 and variance $\sigma^2/n$.
The future observation to be predicted is $y_f = \eta + e_f$, where $e_f$ is the random variation associated with
the future observation, normally distributed with mean 0 and variance $\sigma^2$. Thus, the prediction error is
$y_f - \bar{y} = e_f - e$, which, because $e_f$ and $e$ are independent, has variance $\sigma^2 + (\sigma^2/n)$. The length of
the prediction interval to contain $y_f$ will be proportional to $\sqrt{\sigma^2 + (\sigma^2/n)}$. Increasing the initial
sample size will reduce the imprecision associated with the sample mean $\bar{y}$ (i.e., $\sigma^2/n$), but it will not
reduce the sampling error in the estimate of the variation ($\sigma^2$) associated with the future observations.
Thus, an increase in the size of the initial sample
beyond the point where the inherent variation in the future sample tends to dominate will not
materially reduce the length of the prediction interval.
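The variance argument can be written out in one line. The sketch below assumes, as is implicit in random sampling, that the future error $e_f$ is independent of the initial-sample error $e$; the limit statement is added here for illustration.

```latex
\begin{aligned}
\operatorname{Var}(y_f - \bar{y})
  &= \operatorname{Var}(e_f - e)
   = \operatorname{Var}(e_f) + \operatorname{Var}(e)
   = \sigma^2 + \frac{\sigma^2}{n}
   = \sigma^2\left(1 + \frac{1}{n}\right), \\
\sqrt{\operatorname{Var}(y_f - \bar{y})}
  &= \sigma\sqrt{1 + \frac{1}{n}}
   \;\longrightarrow\; \sigma
   \quad \text{as } n \to \infty .
\end{aligned}
```

The term $\sigma^2/n$ vanishes as the initial sample grows, while the term $\sigma^2$ contributed by the future observation does not.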
A confidence interval to contain a population parameter converges to a point as the sample size
increases. A prediction interval converges to an interval. Thus, it is not possible to obtain a prediction
interval consistently shorter than some limiting interval, no matter how large an initial sample is taken
(Hahn and Meeker, 1991).
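The contrast between the two kinds of interval can be seen numerically. The following is a simplified sketch that assumes σ is known, so that the normal quantile 1.96 can stand in for the t-based factors used in practice; the function names are hypothetical.

```python
import math

Z_0975 = 1.96  # standard normal quantile for a two-sided 95% interval

def ci_half_width(sigma, n):
    """Half-width of a confidence interval for the mean (sigma assumed known)."""
    return Z_0975 * sigma / math.sqrt(n)

def pi_half_width(sigma, n):
    """Half-width of a prediction interval for one future observation
    (sigma assumed known): z * sqrt(sigma^2 + sigma^2 / n)."""
    return Z_0975 * sigma * math.sqrt(1 + 1 / n)

sigma = 1.0  # illustrative value, not from the text
for n in (5, 50, 500, 5000):
    print(f"n = {n:5d}  CI half-width = {ci_half_width(sigma, n):.3f}  "
          f"PI half-width = {pi_half_width(sigma, n):.3f}")
```

As n grows, the confidence-interval half-width shrinks toward zero, whereas the prediction-interval half-width levels off near 1.96σ, the limiting interval described above.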

Statistical Interval for the Standard Deviation of a Normal Distribution

Confidence and prediction intervals for the standard deviation of a normal distribution can be calculated
using factors from Table 21.3. The factors are based on the $\chi^2$ distribution and are asymmetric. They
are multipliers and the intervals have the form:

$[k_1 s,\; k_2 s]$
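To illustrate how such multipliers arise, the sketch below computes confidence-interval factors for σ under the assumption that they follow the standard χ²-based form k₁ = sqrt((n − 1)/χ²(1 − α/2, n − 1)) and k₂ = sqrt((n − 1)/χ²(α/2, n − 1)). The function name is hypothetical, the χ² quantiles for 4 degrees of freedom are copied from a standard table, and reusing n = 5 and s = 1.18 µg/L from Example 21.2 is an illustration only; Table 21.3 itself is not reproduced here.

```python
import math

def sigma_ci_factors(n, chi2_lower, chi2_upper):
    """Multipliers (k1, k2) such that [k1*s, k2*s] is a confidence interval
    for sigma, given the lower and upper chi-square quantiles for n-1 df."""
    df = n - 1
    k1 = math.sqrt(df / chi2_upper)  # larger quantile gives the lower limit
    k2 = math.sqrt(df / chi2_lower)  # smaller quantile gives the upper limit
    return k1, k2

# Chi-square quantiles for 4 degrees of freedom (standard table values)
CHI2_4_0025 = 0.4844   # 2.5% quantile
CHI2_4_0975 = 11.143   # 97.5% quantile

n, s = 5, 1.18  # illustrative reuse of the Example 21.2 sample
k1, k2 = sigma_ci_factors(n, CHI2_4_0025, CHI2_4_0975)

print(f"k1 = {k1:.2f}, k2 = {k2:.2f}")
print(f"95% interval for sigma: [{k1 * s:.2f}, {k2 * s:.2f}] ug/L")
```

The two multipliers are not equidistant from 1, which reflects the asymmetry of the χ² distribution noted above.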
