Page 176 - Statistics for Environmental Engineers
21
Tolerance Intervals and Prediction Intervals
KEY WORDS confidence interval, coverage, groundwater monitoring, interval estimate, lognormal
distribution, mean, normal distribution, point estimate, precision, prediction interval, random sampling,
random variation, spare parts inventory, standard deviation, tolerance coefficient, tolerance interval,
transformation, variance, water quality monitoring.
Often we are interested more in an interval estimate of a parameter than in a point estimate. When told
that the average efficiency of a sample of eight pumps was 88.3%, an engineer might say, “The point
estimate of 88.3% is a concise summary of the results, but it provides no information about their
precision.” The estimate based on the sample of 8 pumps may be quite different from the results if a
different sample of 8 pumps were tested, or if 50 pumps were tested. Is the estimate 88.3 ± 1%, or 88.3
± 5%? How good is 88.3% as an estimate of the efficiency of the next pump that will be delivered? Can
we be reasonably confident that it will be within 1% or 10% of 88.3%?
Understanding this uncertainty is as important as making the point estimate. The main goal of statistical
analysis is to quantify these kinds of uncertainties, which are expressed as intervals.
The choice of a statistical interval depends on the application and the needs of the problem. One must
decide whether the main interest is in describing the population or process from which the sample has
been selected or in predicting the results of a future sample from the same population. Confidence
intervals enclose the population mean and tolerance intervals contain a specified proportion of a
population. In contrast, intervals for a future sample mean and intervals to include all of m future
observations are called prediction intervals because they deal with predicting (or containing) the results
of a future sample from a previously sampled population (Hahn and Meeker, 1991).
Confidence intervals were discussed in previous chapters. This chapter briefly considers tolerance
intervals and prediction intervals.
Tolerance Intervals
A tolerance interval contains a specified proportion (p) of the units from the sampled population or
process. For example, based upon a past sample of copper concentration measurements in sludge, we
might wish to compute an interval that contains, with a specified degree of confidence, at least 90%
of the copper concentrations from the sampled process. The tolerance interval is
constructed from the data using two coefficients, the coverage and the tolerance coefficient. The coverage
is the proportion of the population (p) that an interval is supposed to contain. The tolerance coefficient
is the degree of confidence with which the interval reaches the specified coverage. A tolerance interval
with coverage of 95% and a tolerance coefficient of 90% will contain 95% of the population distribution
with a confidence of 90%.
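To make the two coefficients concrete, the following Python sketch estimates by simulation how often an interval of the form (sample mean) ± K × (sample standard deviation) captures at least a proportion p of a normal population. It is an illustration under stated assumptions, not the method used to build the book's tables: the factor K is approximated with Howe's formula, the chi-square quantile inside it is approximated with the Wilson-Hilferty formula, and the function names (chi2_quantile_wh, howe_k, estimated_confidence) are hypothetical. The population is assumed to be standard normal, which loses no generality because the interval scales with the mean and standard deviation.

```python
import math
import random
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()  # standard normal, used for quantiles and coverage


def chi2_quantile_wh(q, dof):
    """Wilson-Hilferty approximation to the q-quantile of chi-square(dof).

    An approximation chosen to avoid third-party libraries (assumption).
    """
    z = STD_NORMAL.inv_cdf(q)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * math.sqrt(c)) ** 3


def howe_k(n, p, conf):
    """Approximate two-sided normal tolerance factor K (Howe's formula).

    n: sample size, p: coverage, conf: tolerance coefficient (1 - alpha).
    """
    dof = n - 1
    z = STD_NORMAL.inv_cdf((1.0 + p) / 2.0)
    chi2_low = chi2_quantile_wh(1.0 - conf, dof)  # lower-tail quantile
    return math.sqrt(dof * (1.0 + 1.0 / n) * z * z / chi2_low)


def estimated_confidence(n, p, k, trials=5000, seed=1):
    """Fraction of simulated samples whose interval mean +/- k*s
    contains at least a proportion p of the N(0, 1) population."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        m, s = mean(sample), stdev(sample)
        # Exact proportion of the population inside this sample's interval
        covered = STD_NORMAL.cdf(m + k * s) - STD_NORMAL.cdf(m - k * s)
        if covered >= p:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n, p, conf = 10, 0.95, 0.90
    k = howe_k(n, p, conf)
    print("approximate K:", round(k, 3))
    print("simulated confidence:", estimated_confidence(n, p, k))
```

With coverage p = 0.95 and a tolerance coefficient of 0.90, the simulated fraction should land near 0.90: roughly nine out of ten samples produce an interval that holds at least 95% of the population. Making K smaller narrows the interval and lowers that fraction.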
The form of a two-sided tolerance interval is the same as that of a confidence interval:
$\bar{y} \pm K_{1-\alpha,p,n}\, s$
where the factor $K_{1-\alpha,p,n}$ has a 100(1 − α)% confidence level and depends on n, the number of observations
in the given sample. Table 21.1 gives the factors ($t_{n-1,\alpha/2}/\sqrt{n}$) for two-sided 95% confidence intervals
© 2002 By CRC Press LLC