                       convincing conclusions, but not too small. If the standard error is large, the experiment is worthless, but
                       resources have been wasted if it is smaller than necessary.
                        In a paired t-test, each pair is a block that is not affected by nuisance factors that change during the
                       time between runs. Each pair provides one estimate of the difference between the treatments being
                       compared. If we have only one pair, we can estimate the average difference but we can say nothing
                       about the precision of the estimate because we have no degrees of freedom with which to estimate the
                       experimental error. Making two replicates (two pairs) is an improvement, and going to four pairs is a
big improvement. Suppose the variance of each difference is σ². If we run two replicates (two pairs), the approximate 95% confidence interval would be ±2σ/√2 = ±1.4σ. Four replicates would reduce the confidence interval to ±2σ/√4 = ±σ. Each quadrupling of the sample size reduces the standard error and the confidence interval by half.
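As a quick check on this arithmetic, here is a minimal Python sketch, assuming the approximate 95% interval ±2σ/√n for the mean of n paired differences (σ = 1 is an illustrative value, not from the text):

```python
# Approximate 95% confidence interval half-width, +/- 2*sigma/sqrt(n),
# for the mean of n paired differences (sigma = 1 is illustrative).
import math

sigma = 1.0  # assumed standard deviation of a single paired difference

for n in (2, 4, 8, 16):
    half_width = 2 * sigma / math.sqrt(n)
    print(f"{n:2d} pairs -> 95% CI approx +/- {half_width:.2f} sigma")
```

With two pairs the half-width is about 1.4σ, with four pairs it is σ, and with sixteen pairs it is 0.5σ, in line with the values above.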
                        Two-level factorial experiments, mentioned in the previous chapter as an efficient way to investigate
                       several factors at one time, incorporate the effect of replication. Suppose that we investigate three factors
                       by setting each at two levels and running all eight possible combinations, giving an experiment with n = 8
                       runs. From these eight runs we get four independent estimates of the effect of each factor. This is like having
                       a paired experiment repeated four times for factor A, four times for factor B, and four times for factor C.
                       Each measurement is doing triple duty. In short, we gain a benefit similar to what we gain from replication,
                       but without actually repeating any tests. It is better, of course, to actually repeat some (or all) runs because
                       this will reduce the standard error of the estimated effects and allow us to detect smaller differences. If each
                       test condition were repeated twice, the n = 16 run experiment would be highly informative.
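To make the "four paired comparisons per factor" idea concrete, here is a hedged Python sketch. The 2³ design matrix is standard, but the eight response values are hypothetical, chosen only to illustrate the arithmetic:

```python
# Main effects from a 2^3 factorial: each effect is the average of the four
# (+) responses minus the average of the four (-) responses, so the same
# eight runs give four paired comparisons for each factor.
import itertools

# All eight combinations of low (-1) and high (+1) levels for factors A, B, C.
design = list(itertools.product((-1, 1), repeat=3))

# Hypothetical responses, one per run (illustrative numbers only).
y = [60, 72, 54, 68, 52, 83, 45, 80]

for j, name in enumerate("ABC"):
    high = [yi for levels, yi in zip(design, y) if levels[j] == 1]
    low = [yi for levels, yi in zip(design, y) if levels[j] == -1]
    effect = sum(high) / 4 - sum(low) / 4
    print(f"Main effect of {name}: {effect:+.1f}")
```

Each factor's effect is estimated from all eight observations, which is how every measurement does triple duty.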
                        Halving the standard error is a big gain. If the true difference between two treatments is one standard
                       error, there is only about a 17% chance that it will be detected at a confidence level of 95%. If the true
                       difference is two standard errors, there is slightly better than a 50/50 chance that it will be identified as
                       statistically significant at the 95% confidence level.
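These detection probabilities follow from a normal-approximation power calculation; a sketch, assuming a two-sided test at the 95% level (z = 1.96) and that scipy is available:

```python
# Power of a two-sided z-test at the 95% level when the true difference
# equals delta standard errors:
#   power ~= P(Z > 1.96 - delta) + P(Z < -1.96 - delta)
from scipy.stats import norm

z_crit = 1.96
for delta in (1.0, 2.0):
    power = norm.sf(z_crit - delta) + norm.cdf(-z_crit - delta)
    print(f"true difference = {delta:.0f} standard errors -> power = {power:.2f}")
```

This gives roughly 0.17 for a one-standard-error difference and about 0.52 for a two-standard-error difference, the values quoted above.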
                        We now see the dilemma for the engineer and the statistical consultant. The engineer wants to detect
                       a small difference without doing many replicates. The statistician, not being a magician, is constrained
                       to certain mathematical realities.  The consultant will be most helpful at the planning stages of an
                       experiment when replication, randomization, blocking, and experimental design (factorial, paired test,
                       etc.) can be integrated.
                        What follows are recipes for a few simple situations in single-factor experiments. The theory has been
                       mostly covered in previous chapters.



                       Confidence Interval for a Mean
The (1 − α)100% confidence interval for the mean η has the form ȳ ± E, where E is the half-length, E = z_{α/2} σ/√n. The sample size n that will produce this interval half-length is:

    n = (z_{α/2} σ/E)²

                       The value obtained is rounded to the next highest integer. This assumes random sampling. It also assumes
                       that n is large enough that the normal distribution can be used to define the confidence interval. (For
                       smaller sample sizes, the t distribution is used.)
                        To use this equation we must specify E, α or 1 – α, and σ. Values of 1 – α that might be used are:

    1 − α    0.997    0.99     0.955    0.95     0.90
        z    3.0      2.576    2.0      1.96     1.645

                       The most widely used value of 1 – α is 0.95 and the corresponding value of z = 1.96. For an approximate
95% confidence interval, use z = 2 instead of 1.96 to get n = 4σ²/E². This corresponds to 1 − α = 0.955.
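A short Python sketch of this sample-size calculation; the σ and E values used below are illustrative assumptions, not from the text:

```python
# Sample size for estimating a mean so that the confidence interval
# has half-length E: n = (z * sigma / E)^2, rounded up to the next integer.
import math

def sample_size(sigma, E, z=1.96):
    """Observations needed for half-length E (default is the 95% level)."""
    return math.ceil((z * sigma / E) ** 2)

# Illustrative planning values: sigma = 10 from prior data, desired E = 3.
print(sample_size(sigma=10, E=3))         # z = 1.96 -> n = 43
print(sample_size(sigma=10, E=3, z=2.0))  # approximate z = 2 -> n = 45
```

The approximate rule n = 4σ²/E² (z = 2) is slightly conservative, giving a somewhat larger n than the exact z = 1.96.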
                        The remaining problem is that the true value of σ is unknown, so an estimate is substituted based on
                       prior data of a similar kind or, if necessary, a good guess. If the estimate of σ is based on prior data,