Page 81 - Statistics for Environmental Engineers









                           The 99th quantile of the lognormal distribution is found by making the transformation in reverse:

                                                $$\hat{y}_p = \operatorname{antilog}(\hat{x}_p) = \exp(\hat{x}_p) = 45.9$$
                       An upper $100(1-\alpha)\%$ confidence limit for the true pth quantile, $y_p$, can be easily obtained if the
                       underlying distribution is normal (or has been transformed to become normal). This upper confidence limit is:

                                                $$\mathrm{UCL}_{1-\alpha}(y_p) = \bar{y} + K_{1-\alpha,p}\, s$$

                       where $K_{1-\alpha,p}$ is obtained from a table by Owen (1972), which is reprinted in Gilbert (1987).

                       Example 8.3

                           From n = 300 normally distributed observations we have calculated $\bar{y}$ = 10.0 and s = 1.2. The
                           estimated 99th quantile is $\hat{y}_{0.99}$ = 10 + 2.326(1.2) = 12.79. For n = 300, $1-\alpha$ = 0.95, and
                           p = 0.99, $K_{0.95,0.99}$ = 2.522 (from Gilbert, 1987) and the 95% upper confidence limit for the true
                           99th percentile value is:

                                                $$\mathrm{UCL}_{0.95}(y_{0.99}) = 10 + (1.2)(2.522) = 13.0$$

                           In summary, the best estimate of the 99th quantile is 12.79 and we can state with 95% confidence
                           that its true value is less than 13.0.
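The arithmetic of Example 8.3 can be checked with a minimal Python sketch. The function names `quantile_estimate` and `upper_confidence_limit` are illustrative choices, not part of the text, and the factor K is not computed here: it is entered by hand as the tabulated value 2.522 quoted above.

```python
from statistics import NormalDist


def quantile_estimate(ybar, s, p):
    """Estimate the pth quantile of a normal distribution as ybar + z_p * s."""
    z_p = NormalDist().inv_cdf(p)  # standard normal deviate, about 2.326 for p = 0.99
    return ybar + z_p * s


def upper_confidence_limit(ybar, s, k_factor):
    """Upper confidence limit ybar + K * s, with K taken from a table."""
    return ybar + k_factor * s


# Values from Example 8.3; K = 2.522 is the tabulated factor for n = 300
y99 = quantile_estimate(10.0, 1.2, 0.99)
ucl = upper_confidence_limit(10.0, 1.2, 2.522)
print(f"estimated 99th quantile: {y99:.2f}")
print(f"95% upper confidence limit: {ucl:.1f}")
```

Keeping K as an input mirrors the text, which reads it from a published table instead of deriving it.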
                        Sometimes one is asked to estimate a 99th percentile value and its upper confidence limit from sample
                       sizes that are much smaller than the n = 300 used in this example. Suppose that we have $\bar{y}$ = 10, s = 1.2,
                       and n = 30, which again gives $\hat{y}_{0.99}$ = 12.8. Now, $K_{0.95,0.99}$ = 3.064 (Gilbert, 1987) and:

                                                $$\mathrm{UCL}_{0.95}(y_{0.99}) = 10 + (1.2)(3.064) = 13.7$$

                       compared with the UCL of 13.0 in Example 8.3. This 5% increase in the UCL has no practical importance.
                       A potentially greater error resides in the assumption that the data are normally distributed, which is
                       difficult to verify with a sample of n = 30. If the assumed distribution is wrong, the estimated 99th
                       percentile is badly wrong, even though the confidence limit lies close to the estimate.
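As a rough check of the comparison above, the following Python sketch recomputes both upper confidence limits and the relative increase. The dictionary name `K_FACTORS` is an illustrative choice; its two entries are the tabulated values quoted in the text, and no other table values are assumed.

```python
# K factors quoted in the text (Gilbert, 1987) for 1 - alpha = 0.95 and p = 0.99,
# keyed by sample size n
K_FACTORS = {300: 2.522, 30: 3.064}
ybar, s = 10.0, 1.2

# UCL = ybar + K * s for each sample size
ucl = {n: ybar + k * s for n, k in K_FACTORS.items()}
increase_pct = 100 * (ucl[30] - ucl[300]) / ucl[300]

for n, limit in ucl.items():
    print(f"n = {n:3d}: UCL = {limit:.1f}")
print(f"relative increase: {increase_pct:.0f}%")
```

The sample mean and standard deviation are held fixed, so the whole difference between the two limits comes from the larger K required at n = 30.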



                       Nonparametric Estimates of Quantiles

                       Nonparametric estimation methods do not require a distribution to be known or assumed. They apply
                       to all distributions and can be used with any data set. There is a price for being unable (or unwilling)
                       to make a constraining assumption regarding the population distribution. The estimates obtained by these
                       methods are not as precise as we could obtain with a parametric method. Therefore, use the nonpara-
                       metric method only when the underlying distribution is unknown or cannot be transformed to make it
                       become normal.
                        The data are ordered from smallest to largest just as was done to construct a probability plot (Chapter 5).
                       Percentile estimates could be read from a probability plot. The method to be illustrated here skips the
                       plotting (but with a reminder that plotting data is always a good idea). The estimated pth quantile, $\hat{y}_p$,
                       is simply the datum of rank k in the ordered set (counting from the smallest), where k = p(n + 1), n is the
                       number of data points, and p is the quantile level of interest. If k is not an integer, $\hat{y}_p$ is obtained
                       by linear interpolation between the two closest ordered values.
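The ranking rule just described can be sketched in a few lines of Python. The function name `nonparametric_quantile` and the choice to raise an error when the rank falls outside the data are assumptions made for this illustration; the text does not say how to handle that case.

```python
import math


def nonparametric_quantile(data, p):
    """Estimate the pth quantile as the ordered value of rank k = p(n + 1).

    Ranks count from the smallest value. A non-integer k is handled by
    linear interpolation between the two neighboring ordered values.
    """
    ordered = sorted(data)
    n = len(ordered)
    k = p * (n + 1)
    if k < 1 or k > n:
        # Assumption for this sketch: refuse to extrapolate beyond the data
        raise ValueError("rank k = p(n + 1) falls outside 1..n")
    lower = math.floor(k)  # rank of the ordered value just below k
    frac = k - lower       # fractional distance toward the next rank
    if frac == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])


# Made-up data for illustration only: the integers 1..9
demo = list(range(1, 10))
print(nonparametric_quantile(demo, 0.5))   # k = 5, the 5th ordered value
print(nonparametric_quantile(demo, 0.75))  # k = 7.5, halfway between ranks 7 and 8
```

For n = 575 and p = 0.99 the rank is k = 0.99(576) = 570.24, so the estimate lies between the 570th and 571st ordered values. With n = 30 the same p gives k = 30.69, which exceeds n, so this sketch raises an error instead of returning a value.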

                       Example 8.4

                           A sample of n = 575 daily BOD observations is available to estimate the 99th percentile by the
                           nonparametric method for the purpose of setting a maximum limit in a paper mill’s discharge
                           permit. The 11 largest ranked observations are:
                       © 2002 By CRC Press LLC