Page 81 - Statistics for Environmental Engineers

The 99th quantile of the lognormal distribution is found by making the transformation in reverse:

$$\hat{y}_p = \operatorname{antilog}(\hat{x}_p) = \exp(\hat{x}_p) = 45.9$$

An upper 100(1 − α)% confidence limit for the true pth quantile, $y_p$, can be easily obtained if the underlying
distribution is normal (or has been transformed to become normal). This upper confidence limit is:

$$\mathrm{UCL}_{1-\alpha}(y_p) = \bar{y} + K_{1-\alpha,p}\, s$$

where $K_{1-\alpha,p}$ is obtained from a table by Owen (1972), which is reprinted in Gilbert (1987).

Example 8.3

From n = 300 normally distributed observations we have calculated $\bar{y}$ = 10.0 and s = 1.2. The
estimated 99th quantile is $\hat{y}_{0.99}$ = 10 + 2.326(1.2) = 12.79. For n = 300, 1 − α = 0.95, and p = 0.99,
$K_{0.95,0.99}$ = 2.522 (from Gilbert, 1987) and the 95% upper confidence limit for the true 99th
percentile value is:

$$\mathrm{UCL}_{0.95}(y_{0.99}) = 10 + (1.2)(2.522) = 13.0$$

In summary, the best estimate of the 99th quantile is 12.79 and we can state with 95% confidence
that its true value is less than 13.0.
Sometimes one is asked to estimate a 99th percentile value and its upper confidence limit from sample
sizes that are much smaller than the n = 300 used in this example. Suppose that we have $\bar{y}$ = 10, s = 1.2,
and n = 30, which again gives $\hat{y}_{0.99}$ = 12.8. Now, $K_{0.95,0.99}$ = 3.064 (Gilbert, 1987) and:

$$\mathrm{UCL}_{0.95}(y_{0.99}) = 10 + (1.2)(3.064) = 13.7$$

compared with a UCL of 13.0 in Example 8.3. This 5% increase in the UCL has no practical importance.
A potentially greater error resides in the assumption that the data are normally distributed, which is
difficult to verify with a sample of n = 30. If the assumed distribution is wrong, the estimated $\hat{y}_{0.99}$ is
badly wrong, even though the confidence limit lies close to the estimate.
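The arithmetic in Example 8.3 and in the n = 30 comparison can be checked with a short script. This is a minimal sketch using only the Python standard library. The function names are hypothetical, and the K factors are not computed: they are entered by hand from the values quoted in the text (Gilbert, 1987).

```python
from statistics import NormalDist


def estimated_quantile(mean, sd, p):
    """Point estimate of the p-th quantile of a normal distribution: mean + z_p * sd."""
    z_p = NormalDist().inv_cdf(p)  # standard normal deviate, about 2.326 for p = 0.99
    return mean + z_p * sd


def upper_confidence_limit(mean, sd, k_factor):
    """Upper confidence limit for the quantile: mean + K * sd, with K taken from a table."""
    return mean + k_factor * sd


y_bar, s = 10.0, 1.2

# Point estimate of the 99th percentile (does not depend on n).
q99 = estimated_quantile(y_bar, s, 0.99)

# K factors for 1 - alpha = 0.95 and p = 0.99, as quoted in the text.
ucl_300 = upper_confidence_limit(y_bar, s, 2.522)  # n = 300
ucl_30 = upper_confidence_limit(y_bar, s, 3.064)   # n = 30

print(round(q99, 2), round(ucl_300, 1), round(ucl_30, 1))  # 12.79 13.0 13.7
print(f"relative increase in UCL: {100 * (ucl_30 - ucl_300) / ucl_300:.1f}%")
```

Only the K factor changes with sample size; the point estimate is the same for n = 30 and n = 300 because it uses the normal deviate $z_p$ alone.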

Nonparametric Estimates of Quantiles

Nonparametric estimation methods do not require a distribution to be known or assumed. They apply
to all distributions and can be used with any data set. There is a price for being unable (or unwilling)
to make a constraining assumption regarding the population distribution. The estimates obtained by these
methods are not as precise as those obtained with a parametric method. Therefore, use the nonparametric
method only when the underlying distribution is unknown or the data cannot be transformed to
normality.
The data are ordered from smallest to largest just as was done to construct a probability plot (Chapter 5).
Percentile estimates could be read from a probability plot. The method to be illustrated here skips the
plotting (but with a reminder that plotting data is always a good idea). The estimated pth quantile, $\hat{y}_p$,
is simply the datum of rank k in the ordered set (counting from the smallest), where k = p(n + 1), n is the
number of data points, and p is the quantile level of interest. If k is not an integer, $\hat{y}_p$ is obtained by
linear interpolation between the two closest ordered values.
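The rank-and-interpolate rule just described can be sketched in a few lines of Python. The function name and the nine-value data set are hypothetical and serve only to illustrate the rule. Raising an error when k falls outside the range 1 to n is an assumption of this sketch, since the text does not say how to handle that case.

```python
import math


def nonparametric_quantile(data, p):
    """Estimate the p-th quantile as the datum of rank k = p(n + 1),
    counting from the smallest, with linear interpolation for fractional k."""
    ordered = sorted(data)
    n = len(ordered)
    k = p * (n + 1)
    if k < 1 or k > n:
        # Assumption of this sketch: refuse to extrapolate beyond the data.
        raise ValueError(f"rank k = {k:.2f} is outside 1..{n}; sample too small for p = {p}")
    lower = math.floor(k)   # rank of the ordered value just below k
    frac = k - lower        # fractional distance toward the next rank
    if lower == n:          # k equals n exactly: the largest value
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]  # ranks are 1-based
    return below + frac * (above - below)


# Hypothetical data, n = 9.
sample = [12, 15, 11, 19, 14, 17, 13, 16, 18]
print(nonparametric_quantile(sample, 0.50))  # k = 5.0 -> 15
print(nonparametric_quantile(sample, 0.75))  # k = 7.5 -> 17.5
```

With only nine observations a 99th percentile cannot be estimated this way (k = 9.9 exceeds n), which is why the sketch raises an error rather than returning the maximum.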

Example 8.4

A sample of n = 575 daily BOD observations is available to estimate the 99th percentile by the
nonparametric method for the purpose of setting a maximum limit in a paper mill’s discharge
permit. The 11 largest ranked observations are:
© 2002 By CRC Press LLC

76 77 78 79 80 81 82 83 84 85 86