Page 199 - Statistics for Environmental Engineers


TABLE 23.1
Reduction in the Width of the 95% Confidence Interval (α = 0.05) as the Sample
Size is Increased, Assuming $E = t_{\alpha/2} s/\sqrt{n}$

n                                   2      3      4      5      8      10     15     20     25
$t_{\alpha/2}$                      4.30   3.18   2.78   2.57   2.31   2.23   2.13   2.09   2.06
$\sqrt{n}$                          1.41   1.73   2.00   2.2    2.8    3.2    3.9    4.5    5.0
$E = t_{\alpha/2} s/\sqrt{n}$       3.0s   1.8s   1.4s   1.2s   0.8s   0.7s   0.55s  0.47s  0.41s

we assume that the system will not change during the next phase of sampling. This can be checked as
data are collected and the sampling plan can be revised if necessary.
For smaller sample sizes, say n < 30, and assuming that the distribution of the sample mean is
approximately normal, the confidence interval half-width is $E = t_{\alpha/2} s/\sqrt{n}$, and we can assert with
(1 − α)100% confidence that E is the maximum error made in using $\bar{y}$ to estimate η.
The value of t decreases as n increases, but there is little change once n exceeds 5, as shown in Table
23.1. The greatest gain in narrowing the confidence interval comes from the decrease in $1/\sqrt{n}$ and not from the
decrease in t. Doubling n shrinks the confidence interval by a factor of $1/\sqrt{2}$ when the sample is
large (n > 30). For small samples the gain is more impressive. For a stated level of confidence, doubling the
sample size from 5 to 10 reduces the half-width of the confidence interval by about 40% (from 1.2s to 0.7s in
Table 23.1). Increasing the sample size from 5 to 20 reduces the half-width by about 60% (from 1.2s to 0.47s).
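The arithmetic behind these comparisons can be checked with a short script. This is a minimal sketch that takes the t values exactly as printed in Table 23.1; the function names `half_width_factor` and `reduction` are illustrative and not part of the original text.

```python
import math

# t values for alpha = 0.05 as printed in Table 23.1, keyed by sample size n.
T_TABLE_23_1 = {2: 4.30, 3: 3.18, 4: 2.78, 5: 2.57, 8: 2.31,
                10: 2.23, 15: 2.13, 20: 2.09, 25: 2.06}


def half_width_factor(n):
    """Return E/s = t / sqrt(n) for a sample size listed in Table 23.1."""
    return T_TABLE_23_1[n] / math.sqrt(n)


def reduction(n_from, n_to):
    """Fractional reduction in half-width when n grows from n_from to n_to."""
    return 1.0 - half_width_factor(n_to) / half_width_factor(n_from)


# Reproduce the last row of the table (half-width in units of s).
for n in sorted(T_TABLE_23_1):
    print(f"n = {n:2d}   E = {half_width_factor(n):.2f}s")

# Small-sample gains discussed in the text.
print(f"n: 5 -> 10 reduces E by {reduction(5, 10):.0%}")
print(f"n: 5 -> 20 reduces E by {reduction(5, 20):.0%}")

# Large-sample limit: t is essentially constant, so doubling n
# reduces E by 1 - 1/sqrt(2).
print(f"large n, doubling reduces E by {1 - 1 / math.sqrt(2):.0%}")
```

Because the script uses unrounded square roots, its results can differ in the last digit from ratios formed with the rounded entries of the table.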
An exact solution for the sample size at small n must be found iteratively, because t itself depends on n,
but a good approximate solution is obtained by using a rounded value of t = 2.1 or 2.2, which covers a good
working range of n = 10 to n = 25. When analyzing data we carry three decimal places in the value of t, but
that kind of accuracy is misplaced when sizing the sample. The greatest uncertainty lies in the value of the
specified s, so we can conveniently round off t to one decimal place.
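The two approaches can be compared with a short sketch. Several things here are assumptions and not taken from the text: the t values come from a standard two-sided 95% t table with the usual n − 1 degrees of freedom, the large-sample value 1.96 is substituted beyond 30 degrees of freedom, the function names are hypothetical, and the inputs s = 38.8 and E = 20 are chosen only for illustration.

```python
import math

# Two-sided 95% t values (upper 2.5% point) for 1..30 degrees of freedom,
# copied from a standard t table. Assumption: df = n - 1.
T_975 = [12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
         2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086,
         2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045, 2.042]
Z_975 = 1.96  # large-sample value, used here beyond 30 degrees of freedom


def t_value(df):
    """Look up t for the given degrees of freedom; fall back to z above 30.

    The fallback makes results near n = 31 slightly optimistic.
    """
    return T_975[df - 1] if df <= len(T_975) else Z_975


def sample_size_rounded_t(s, E, t=2.1):
    """Approximate n from n = (t s / E)^2 with a rounded t, rounded up."""
    return math.ceil((t * s / E) ** 2)


def sample_size_search(s, E, n_max=10_000):
    """Smallest n >= 2 whose half-width t * s / sqrt(n) does not exceed E."""
    for n in range(2, n_max + 1):
        if t_value(n - 1) * s / math.sqrt(n) <= E:
            return n
    raise ValueError("no sample size up to n_max meets the target half-width")


# Hypothetical inputs, for illustration only.
s, E = 38.8, 20.0
print("rounded t = 2.1:", sample_size_rounded_t(s, E))
print("search over n:  ", sample_size_search(s, E))
```

For these illustrative inputs both routes give n = 17, which is consistent with the point that a rounded t is adequate when the main uncertainty is in s.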
Another reason not to be unreasonably precise about this calculation is that the sample size you calculate
will usually be rounded up, not just to the next higher integer, but to some even larger convenient number.
If you calculate a sample size of n = 26, you might well decide to collect 30 or 35 specimens to allow for
breakage or other loss of information. If you find after analysis that your sample size was too small, it is
expensive to go back to collect more experimental material, and you may find that conditions have shifted
and the overall variability has increased. In other words, the calculated n is guidance and not a limitation.

Example 23.1

We wish to estimate the mean of a process to within ten units of the true value, with 95% confidence.
Assuming that a large sample is needed, use:

$$ n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^{2} $$

Ten random preliminary measurements [233, 266, 283, 233, 201, 149, 219, 179, 220, and 214]
give $\bar{y}$ = 220 and s = 38.8. Using s as an estimate of σ and E = 10:

$$ n = \left( \frac{1.96\,(38.8)}{10} \right)^{2} \approx 58 $$
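The numbers in Example 23.1 can be reproduced with Python's standard library. This is a minimal sketch; the variable names are illustrative, and rounding the result up to the next integer is an assumption in line with the earlier remark that calculated sample sizes are rounded up.

```python
import math
import statistics

# Ten preliminary measurements from Example 23.1.
measurements = [233, 266, 283, 233, 201, 149, 219, 179, 220, 214]

Z = 1.96   # z value for 95% confidence, as used in the example
E = 10     # desired half-width of the confidence interval

y_bar = statistics.mean(measurements)   # sample mean
s = statistics.stdev(measurements)      # sample standard deviation (n - 1 divisor)

# Large-sample formula n = (z * sigma / E)^2 with s standing in for sigma,
# rounded up to a whole number of observations.
n_required = math.ceil((Z * s / E) ** 2)

print(f"mean = {y_bar:.1f}, s = {s:.1f}, required n = {n_required}")
```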
Example 23.2

A monitoring study is intended to estimate the mean concentration of a pollutant at a sewer
monitoring station. A preliminary survey consisting of ten representative observations gave [291,
320, 140, 223, 219, 195, 248, 251, 163, and 292]. The average is $\bar{y}$ = 234.2 and the sample standard
© 2002 By CRC Press LLC
