Page 87 - Becoming Metric Wise


4.3.4 The Mode
When dealing with nominal data (classes are just names with no inherent
order, such as names of scientists, or countries) or in cases where only a
small number of values is possible, one may use the mode as a measure of
central tendency. The mode is the value that occurs most often. When data
are grouped into classes there may be a modal class: the class occurring
with the highest frequency. In such cases, the midpoint of the modal class
is taken as the mode.
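The two cases described above can be illustrated with a minimal Python sketch. The function names `modes` and `modal_class_midpoint`, the country list, and the grouped frequencies are all hypothetical and chosen only for illustration; the sketch also assumes that ties are reported as several modes, which the text does not specify.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency.

    Assumption: ties are all reported, so the result is a list.
    Raises ValueError on empty input (max() of an empty sequence).
    """
    counts = Counter(values)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


def modal_class_midpoint(class_counts):
    """Return the midpoint of the class with the highest frequency.

    class_counts maps (lower, upper) class bounds to frequencies.
    Assumption: if several classes tie, the first one found is used.
    """
    (lower, upper), _ = max(class_counts.items(), key=lambda item: item[1])
    return (lower + upper) / 2


# Hypothetical nominal data: countries of authors.
countries = ["Belgium", "China", "Belgium", "India", "China", "Belgium"]
print(modes(countries))  # ['Belgium']

# Hypothetical grouped data: class bounds mapped to frequencies.
grouped = {(0, 10): 3, (10, 20): 7, (20, 30): 5}
print(modal_class_midpoint(grouped))  # 15.0
```

Here the modal class is (10, 20), so its midpoint, 15, is reported as the mode.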

4.4 CUMULATIVE DISTRIBUTIONS AND THE QUANTILE FUNCTION
4.4.1 The Observed Cumulative Distribution

The value of an observed or empirical cumulative distribution at a point
$x$, given the $n$ observations $x_1, \ldots, x_n$, is the number of
observations $x_j$ smaller than or equal to $x$, divided by the total number
of observations, $n$. This observed cumulative distribution will be denoted
as $\hat{F}_n(x)$. The index $n$ refers to the number of data (not
necessarily different ones) and the "hat" refers to the fact that it is an
observed rather than a theoretical distribution. A cumulative distribution
can be represented as an increasing step function.
Suppose we have observed the values 1, 2, 3, and 4, each exactly
once. What is the observed cumulative distribution $\hat{F}_n(x)$? The
number 1 is the smallest value we have observed. Hence, $\hat{F}_n(x) = 0$
for $x < 1$. We see that $\hat{F}_n(1) = 1/4$, and for every $x$ in $[1, 2)$,
$\hat{F}_n(x)$ stays equal to $1/4$. Clearly, $\hat{F}_n(2) = 2/4$,
$\hat{F}_n(3) = 3/4$ and $\hat{F}_n(4) = 1$. This empirical function is
constant on each half-open interval between these points, and for $x \geq 4$,
$\hat{F}_n(x) = 1$. Fig. 4.6 illustrates the result of this procedure.
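The worked example can be checked with a short Python sketch. The function name `ecdf` is hypothetical, and the sketch assumes that an observation equal to $x$ is counted, which is what the values in the example require. Exact fractions are used so that the results can be compared directly with 1/4, 2/4 and 3/4.

```python
from bisect import bisect_right
from fractions import Fraction


def ecdf(observations):
    """Build the empirical cumulative distribution of the observations.

    Assumption: the value at x is the share of observations <= x.
    """
    data = sorted(observations)
    n = len(data)

    def cumulative(x):
        # bisect_right returns how many sorted observations are <= x.
        return Fraction(bisect_right(data, x), n)

    return cumulative


F = ecdf([1, 2, 3, 4])
for x in (0.5, 1, 1.5, 2, 3, 4, 10):
    print(x, F(x))
```

The function is 0 to the left of 1, jumps by 1/4 at each observed value, stays constant in between, and equals 1 from 4 onward. `Fraction` prints 2/4 in lowest terms, as 1/2.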

4.4.2 The Quantile Function
Now we ask: Given the percentage $p$, which real number $x$ has $p$ as its
cumulative frequency? Consider, for example, $p = 0.25$; then we are
interested in that observation $x$ such that 25% of all observations are
smaller than or equal to $x$. For $p = 0.5$ we are interested in the
observation situated in the middle. In mathematical terminology, this is the
problem of finding an inverse function of the cumulative distribution
function. However, since for empirical (= observed) distributions this
function is a step function and hence not one-to-one (not injective), one
has to agree on a precise
