Page 50 - Statistics for Environmental Engineers









                       For example, the BOD discharged into a freely flowing stream is important the day it is discharged. A
                       2- or 3-day average might also be important because a few days of dissolved oxygen depression could
                       be disastrous while one day might be tolerable to aquatic organisms. A 30-day average of BOD could
                       be a less informative statistic about the threat to fish than a short-term average, but it may be needed to
                       assess the long-term trend in treatment plant performance.
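The contrast between daily values and short-term averages can be sketched in a few lines of Python. This is only an illustration: the function name `moving_average`, the 3-day window, and the BOD values are hypothetical and do not come from the text.

```python
def moving_average(values, window):
    """Return trailing moving averages of `values` over `window` consecutive days.

    The first result covers days 1..window, the next covers days 2..window+1, and so on.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical daily effluent BOD values (mg/L), for illustration only.
daily_bod = [12.0, 15.0, 40.0, 38.0, 14.0, 11.0, 13.0]

# A 3-day average keeps the two-day excursion visible while damping one-day noise.
three_day = moving_average(daily_bod, 3)
print(three_day)
```

Widening the window to 30 days would smooth the same excursion almost entirely away, which is why a long-term average says little about a short episode of oxygen depression.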
                        For suspended solids that settle on a stream bed and form sludge banks, a long-term average might
                       be related to depth of the sludge bed and therefore be an informative statistic. If the solids do not settle,
                       the daily values may be more descriptive of potential damage. For a pollutant that could be ingested by
                       an organism and later excreted or metabolized, the exponentially weighted moving average might be a
                       good statistic.
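As a rough sketch of the exponentially weighted moving average mentioned above, the recursion z_t = λ·y_t + (1 − λ)·z_{t−1} can be coded as follows. The smoothing constant `lam`, the choice to start the recursion at the first observation, and the sample data are all assumptions made for illustration.

```python
def ewma(values, lam):
    """Exponentially weighted moving average.

    z[0] = y[0]; z[t] = lam * y[t] + (1 - lam) * z[t - 1].
    A larger `lam` gives recent observations more weight, so older
    observations fade faster.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must be in (0, 1]")
    smoothed = []
    for y in values:
        if not smoothed:
            # Assumed initialization: start the recursion at the first value.
            smoothed.append(float(y))
        else:
            smoothed.append(lam * y + (1.0 - lam) * smoothed[-1])
    return smoothed


# Hypothetical daily pollutant concentrations, for illustration only.
daily = [10.0, 20.0, 20.0, 5.0]
print(ewma(daily, 0.5))
```

The geometric decay of the weights loosely mimics an organism that gradually excretes or metabolizes what it took in earlier, so the statistic tracks recent exposure more than distant exposure.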
                        Conversely, some pollutants may not exhibit their effect for years. Carcinogens are an example where
                       the long-term average could be important. Long-term in this context means years, so the 30-day average would
                       not be a particularly useful statistic. The first ingested (or inhaled) irritants may be more important
                       than recently ingested material. If so, a statistic meant to relate the source of pollution to its present
                       effect should perhaps weight past events more heavily than recent events. Choosing a statistic with the
                       appropriate weighting could increase the value of the data to biologists, epidemiologists, and others who
                       seek to relate pollutant discharges to effects on organisms.
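One possible way to weight past events more heavily than recent ones is a weighted average whose weights shrink with time since the first observation. The sketch below is an assumption, not a method given in the text: the function name `early_weighted_average`, the geometric weights, and the `decay` parameter are all hypothetical.

```python
def early_weighted_average(values, decay):
    """Weighted average that favors the earliest observations.

    `values` is ordered oldest first. The i-th value (i = 0 for the
    oldest) gets weight decay**i, so with decay < 1 the oldest
    observation counts most and later ones count progressively less.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")
    weights = [decay ** i for i in range(len(values))]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical annual exposure values, oldest first, for illustration only.
exposure = [8.0, 6.0, 3.0, 1.0]
print(early_weighted_average(exposure, 0.7))
```

This is the mirror image of the exponentially weighted moving average: there the weights decay going back in time, here they decay going forward. With `decay = 1.0` it reduces to the ordinary arithmetic mean.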






                       Plotting on a Logarithmic Scale
                       The top panel of Figure 4.1 is a plot of influent copper concentration at a wastewater treatment plant.
                       This plot emphasizes the few high values, especially those at days 225, 250, and 340. The bottom panel
                       shows the same data on a logarithmic scale. Now the process behavior appears more consistent. The
                       low values are more evident, and the high values do not seem so extreme. The episode around day 250
                       still looks unusual, but the day 225 and 340 values are above the average (on the log scale) by about
                       the same amount that the lowest values are below average.
                        Are the high values so extraordinary as to deserve special attention? Or are they rogue values (outliers)
                       that can be disregarded? This question cannot be answered without knowing the underlying distribution
                       of the data. If the underlying process naturally generates data with a lognormal distribution, the high
                       values fit the general pattern of the data record.
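To see how a log scale changes the impression given by high values, one can compare each observation's distance from the mean on the arithmetic scale and on the log scale. The sketch below uses made-up concentrations, not the copper data of Figure 4.1, and the helper names are hypothetical. It is not a formal test for a lognormal distribution.

```python
import math
import statistics


def z_scores(values):
    """Distance of each value from the sample mean, in sample standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def log_z_scores(values):
    """Same as z_scores, but computed on log10-transformed values (all must be > 0)."""
    return z_scores([math.log10(v) for v in values])


# Hypothetical concentrations spanning two orders of magnitude (illustration only).
conc = [10.0, 100.0, 1000.0]

# Arithmetic scale: the high value sits much farther from the mean than the low value.
print(z_scores(conc))

# Log scale: the high and low values are equally far from the mean.
print(log_z_scores(conc))
```

If high values look no more extreme on the log scale than low values do, that is consistent with a lognormal process, although a judgment about outliers still requires knowledge of the underlying distribution.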





                       [Figure 4.1, two panels. Top: copper (mg/L) versus days (0 to 350) on an arithmetic scale. Bottom: the same data on a logarithmic scale.]

                       FIGURE 4.1 Copper data plotted on arithmetic and logarithmic scales give a different impression about the high values.
                       © 2002 By CRC Press LLC