Page 67 - Statistics for Environmental Engineers

                       Setting Critical Levels
                       The reference distribution shows at a glance which values are exceptionally high or low. What is meant
                       by “exceptional” can be specified by setting critical decision levels that have a specified probability
                       value. For example, one might specify exceptional as the level that is exceeded p percent of the time.
                       The reference distribution for daily observations during stable operation (bottom panel in Figure 6.5)
                       is based on 1150 daily values representing stable performance. The critical upper 5% level is a BOD
                       concentration of 17 mg/L. This is found by summing the frequencies, starting from the highest BOD
                       observed during stable operation, until the accumulated percentage equals or exceeds 5%. In this case, the
                       probability that the BOD is 20 is P(BOD = 20) = 0.8%. Also, P(BOD = 19) = 0.8%, P(BOD = 18) = 1.6%,
                       and P(BOD = 17) = 1.6%. The sum of these percentages is 4.8%. So, as a practical matter, we can say
                       that the BOD exceeds 16 mg/L only about 5% of the time when operation is stable.
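                        The summation described above can be sketched in a few lines of Python. This is a minimal
                       illustration, not the authors' procedure: the function names are our own, the percentages for
                       17 to 20 mg/L are the ones quoted in the text, and the entry for 16 mg/L is a hypothetical
                       placeholder added only so the loop has a value at which to stop. The rule coded here keeps
                       lowering the cut while the accumulated upper-tail percentage stays at or below the target.

```python
def upper_tail_percent(freq_pct):
    """Map each value v to P(X >= v) in percent, summing from the highest value down."""
    tail = {}
    running = 0.0
    for value in sorted(freq_pct, reverse=True):
        running += freq_pct[value]
        tail[value] = running
    return tail


def critical_level(freq_pct, alpha_pct):
    """Lowest value whose upper-tail percentage does not exceed alpha_pct (None if no value qualifies)."""
    level = None
    for value, pct in upper_tail_percent(freq_pct).items():
        if pct <= alpha_pct:
            level = value  # the tail is still within the target, so lower the cut
        else:
            break  # tail percentages only grow from here on
    return level


# Percent of days at each BOD value (mg/L), upper end of the distribution only.
# Values for 17-20 are quoted in the text; the value for 16 is HYPOTHETICAL.
top_of_distribution = {20: 0.8, 19: 0.8, 18: 1.6, 17: 1.6, 16: 2.4}

print(critical_level(top_of_distribution, 5.0))
```

                       With these inputs the accumulated percentage at 17 mg/L is 4.8%, so the function returns 17,
                       which agrees with the statement that the BOD exceeds 16 mg/L only about 5% of the time.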
                        Upper critical levels can be set for the MA(7) reference distribution as well. The probability that a
                       7-day moving average of 14 mg/L or higher will occur when the treatment plant is stable is 5%. An MA(7)
                       greater than 13 mg/L serves warning that the process is performing poorly and may be upset. By definition,
                       5% of such warnings will be false alarms. A two-level warning system could be devised, for example,
                       by using the upper 1% and the upper 5% levels. The upper 1% level, which is about 16 mg/L, is a signal
                       that something is almost certainly wrong; it will be a false alarm in only 1 out of 100 alerts.
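                        A two-level system of this kind might be coded as a simple classification. The sketch below
                       uses the two levels quoted in the text (a warning for an MA(7) greater than 13 mg/L and a
                       stronger signal at about 16 mg/L). The function name, the labels, and the choice to treat a
                       value of exactly 16 mg/L as an alarm are our own assumptions.

```python
def classify_ma7(ma7, warn_above=13.0, alarm_at=16.0):
    """Classify a 7-day moving average of effluent BOD (mg/L).

    warn_above: upper 5% level from the text (warning when the MA(7) exceeds it).
    alarm_at:   approximate upper 1% level from the text; treating a value equal
                to it as an alarm is an assumption made for this sketch.
    """
    if ma7 >= alarm_at:
        return "alarm"
    if ma7 > warn_above:
        return "warning"
    return "normal"


for value in (12.0, 14.0, 16.5):
    print(value, classify_ma7(value))
```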
                        There is a balance to be found between having occasional false alarms and no false alarms. Setting
                       a warning at the 5% level, or perhaps even at the 10% level, means that an operator is occasionally sent
                       to look for a problem when none exists. But it also means that many times a warning is given before a
                       problem becomes too serious and on some of these occasions action will prevent a minor upset from
                       becoming more serious. An occasional wild goose chase is the price paid for the early warnings.




                       Comments

                       Consider why the warning levels were determined empirically instead of by calculating the mean and
                       standard deviation and then using the normal distribution. People who know some statistics tend to think
                       of the bell-shaped, symmetrical normal distribution when they hear that “the mean is X and the standard
                       deviation is Y.” The words “mean” and “standard deviation” create an image of approximately 95% of
                       the values falling within two standard deviations of the mean.
                        A glance at Figure 6.6 reveals why this is an inappropriate image for the reference distribution of
                       moving averages. The distributions are not symmetrical and, furthermore, they are truncated. These
                       characteristics are especially evident in the MA(30) distribution. By definition, the effluent BOD values
                       are never very high when operation is stable, so the MA cannot take on certain high values. Low values of
                       the MA do not occur because the effluent BOD cannot be less than zero and values less than 2 mg/L
                       were not observed. The normal distribution, with its finite probability of values occurring far out on the
                       tails of the distribution (and even into negative values), would be a terrible approximation of the reference
                       distribution derived from the operating record.
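                        The two properties of the normal distribution mentioned above can be checked numerically.
                       The sketch below computes the probability of falling within two standard deviations of the
                       mean and the probability the normal model would assign to negative values. The mean of
                       9 mg/L and standard deviation of 4 mg/L are hypothetical numbers chosen for illustration;
                       they are not taken from the operating record.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative probability of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def prob_within(k):
    """P(|X - mean| <= k standard deviations) for any normal distribution."""
    return normal_cdf(k) - normal_cdf(-k)


def prob_below_zero(mean, sd):
    """Probability that a normal model with this mean and sd assigns to negative values."""
    return normal_cdf(0.0, mean, sd)


print(round(prob_within(2.0), 3))   # about 0.954, the familiar "95%" image
print(prob_below_zero(9.0, 4.0))    # HYPOTHETICAL mean and sd, for illustration only
```

                       Even with a mean more than two standard deviations above zero, the normal model puts some
                       probability on negative concentrations, which the empirical reference distribution never does.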
                        The reference distribution for the daily values will always give a warning before the MA does. The
                       MA is conservative. It flattens one-day upsets, even fairly large ones, and rolls smoothly through short
                       intervals of minor disturbances without giving much notice. The moving average is like a shock absorber
                       on a car in that it smooths out the small bumps. Also, just as a shock absorber needs to have the right
                       stiffness, a moving average needs to have the right length of memory to do its job well. A 30-day MA is
                       an interesting statistic to plot only because effluent standards use a 30-day average, but it is too sluggish
                       to usefully warn of trouble. At best, it can confirm that trouble has existed. The seven-day average is more
                       responsive to change and serves as a better warning signal. Exponentially weighted moving averages (see
                       Chapter 4) are also responsive and reference distributions can be constructed for them as well.
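                        The effect of memory length can be seen by passing a one-day upset through averages of
                       different lengths. The sketch below is an illustration under stated assumptions: the series
                       (40 days at 10 mg/L with a single day at 24 mg/L) is hypothetical, and the EWMA is written in
                       one common form with a weight lam on the newest observation, which may differ from the
                       notation used in Chapter 4.

```python
def moving_average(values, window):
    """Trailing moving average; one output for each full window of observations."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


def ewma(values, lam):
    """Exponentially weighted moving average, seeded with the first observation.

    lam is the weight given to the newest observation (an assumed parameterization).
    """
    smoothed = [values[0]]
    for y in values[1:]:
        smoothed.append(lam * y + (1.0 - lam) * smoothed[-1])
    return smoothed


# HYPOTHETICAL record: stable at 10 mg/L with a one-day upset to 24 mg/L.
series = [10.0] * 40
series[35] = 24.0

print(max(moving_average(series, 7)))    # spike of 14 spread over 7 days
print(max(moving_average(series, 30)))   # same spike spread over 30 days
print(max(ewma(series, 0.3)))
```

                       The 14 mg/L excursion raises the MA(7) by 14/7 = 2 mg/L but raises the MA(30) by less than
                       0.5 mg/L, which is why the longer average is too sluggish to serve as a warning signal.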
                        Just as there is no reason to judge process performance on the basis of only one variable, there is no
                       reason to select and use only one reference distribution for any particular single variable. One statistic
                       and its reference distribution might be most useful for process control while another is best for judging
                       © 2002 By CRC Press LLC