Page 38 - Statistics for Environmental Engineers









                       BODs in the 1980s are not as high as in the past. This reduction is what has improved the fishery, because
                       the highest BODs were occurring in the summer when stream flow was minimal and water temperature
                       was high. Several kinds of plots were needed to extract useful information from these data. This is often
                       the case with environmental data.



                       Showing Statistical Variation and Precision

                       Measurements vary, and one important function of graphs is to show the variation. There are three very
                       different ways of showing variation: a histogram, a box plot (or box-and-whisker plot), and error
                       bars that represent statistics such as standard deviations, standard errors, or confidence intervals.
                        A histogram shows the shape of the frequency distribution and the range of values; it also gives an
                       impression of central tendency and shows symmetry or lack of it. A box plot is designed to convey a
                       few primary features of a set of data. One form of box plot, the so-called box-and-whisker plot, is used
                       in Figure 3.7 to compare the effluent quality of 12 identical trickling filter pilot plants that received the
                       same influent and were operated in parallel for 35 weeks (Gameson, 1961). It shows the median (50th
                       percentile) as a center bar, and the quartiles (25th and 75th percentiles) as a box. The box covers the
                       middle 50% of the data; the length of the box is called the interquartile range. Plotting the median instead of the
                       average has this advantage: the median is not affected by the extreme values. The “whiskers” cover all
                       but the most extreme values in the data set (the whiskers are explained in Cleveland, 1990, 1994). Extreme
                       values beyond the whiskers are plotted as individual points. If the data come from a normal distribution,
                       the fraction of observations expected to lie beyond the whiskers is slightly more than 1%. The simplicity
                       of the plot makes it convenient to compare the performance of the 12 replicate filters.
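
                       The quantities drawn in a box-and-whisker plot can be computed in a few lines. The sketch below is
                       illustrative only. It assumes the common convention that each whisker ends at the most extreme
                       observation lying within 1.5 interquartile ranges of the box; the text defers to Cleveland for the
                       whisker definition, so that rule is an assumption here. The function name and the BOD values are
                       hypothetical and are not the Gameson (1961) data.

```python
import statistics


def box_whisker_summary(values, k=1.5):
    """Summarize data the way a box-and-whisker plot does.

    k is the whisker multiplier; 1.5 is an assumed (commonly used) value.
    """
    data = sorted(values)
    # Quartiles (25th, 50th, 75th percentiles) by linear interpolation
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # length of the box

    # Fences: limits beyond which a value is plotted as an individual point
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr

    inside = [x for x in data if low_fence <= x <= high_fence]
    extremes = [x for x in data if x < low_fence or x > high_fence]

    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],    # smallest value within the fences
        "whisker_high": inside[-1],  # largest value within the fences
        "extremes": extremes,        # values plotted individually
    }


# Hypothetical effluent BOD values (mg/L), invented for illustration
bod = [4, 5, 5, 6, 6, 7, 7, 8, 9, 21]
summary = box_whisker_summary(bod)
for name, value in summary.items():
    print(f"{name}: {value}")
```

                       In this made-up sample the single high value is reported as an extreme point and does not move the
                       median, which is the advantage of the median noted above.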
                        Figure 3.8 summarizes and compares the trickling filter data of Figure 3.7 by showing the average
                       with error bars that are plus and minus two standard errors (the standard error is an estimate of the
                       standard deviation of the average). This has some weaknesses. The standard error bars are symmetrical
                       about the average, which may lead the viewer to assume that the data are also distributed symmetrically
                       about the mean. Figure 3.7 showed that this is not the case. Also, Figure 3.8 makes the 12 trickling
                       filters appear more different than Figure 3.7 does. This happens because in a few cases the averages are


                       [Figure 3.7: box-and-whisker plots, one panel per trickling filter. Horizontal axis: Effluent BOD (mg/L), 0 to 25.]


                       FIGURE 3.7 Box-and-whisker plots to compare the performance of 12 identical trickling filters operating in parallel. Each
                       panel summarizes 35 measurements.
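
                       The error bars described for Figure 3.8, the average plus and minus two standard errors, can be
                       sketched in the same way. This is a minimal illustration, assuming the usual estimate of the
                       standard error as the sample standard deviation divided by the square root of the number of
                       observations. The function name and the data are hypothetical.

```python
import math
import statistics


def mean_with_error_bars(values, n_se=2.0):
    """Return (average, standard error, lower bar end, upper bar end)."""
    n = len(values)
    average = statistics.fmean(values)
    # Standard error: estimated standard deviation of the average
    std_error = statistics.stdev(values) / math.sqrt(n)
    return average, std_error, average - n_se * std_error, average + n_se * std_error


# Hypothetical effluent BOD values (mg/L), invented for illustration
bod = [4, 5, 5, 6, 6, 7, 7, 8, 9, 21]
average, std_error, lower, upper = mean_with_error_bars(bod)
median = statistics.median(bod)

print(f"average = {average:.2f}, median = {median:.2f}")
print(f"error bars: {lower:.2f} to {upper:.2f} (symmetric about the average)")
```

                       The bars are symmetric about the average by construction, even when the data are skewed, as
                       they are in this made-up sample where one high value pulls the average above the median.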
                       © 2002 By CRC Press LLC