Page 87 - Becoming Metric Wise

              4.3.4 The Mode
              When dealing with nominal data (classes are just names with no inherent
              order, such as names of scientists, or countries) or in cases where only a
              small number of values is possible, one may use the mode as a measure of
              central tendency. The mode is the value that occurs most often. When data
              are grouped into classes there may be a modal class: this is the class
              occurring with the highest frequency. In such cases, the midpoint of the
              modal class is taken as the mode.
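As an illustration, the following minimal Python sketch computes the mode of nominal data and the midpoint of a modal class. The function names `modes` and `modal_class_midpoint`, the sample country codes, and the class boundaries are hypothetical and chosen only for this example. Ties are reported as a list of all most frequent values, which is an assumption, since the text does not say how ties are handled.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # Keep all values tied for the highest count (assumed tie rule).
    return [value for value, count in counts.items() if count == top]


def modal_class_midpoint(class_frequencies):
    """Return the midpoint of the class with the highest frequency.

    class_frequencies maps (lower, upper) class bounds to a frequency.
    If several classes tie, the first one encountered is used.
    """
    (lower, upper), _ = max(class_frequencies.items(), key=lambda item: item[1])
    return (lower + upper) / 2


# Hypothetical nominal data: country codes of authors.
countries = ["BE", "CN", "BE", "US", "CN", "BE"]
print(modes(countries))

# Hypothetical grouped data: class bounds mapped to frequencies.
classes = {(0, 10): 3, (10, 20): 7, (20, 30): 5}
print(modal_class_midpoint(classes))
```

For the sample data, "BE" occurs three times and is the only mode, while the modal class is 10 to 20, with midpoint 15.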



              4.4 CUMULATIVE DISTRIBUTIONS AND THE QUANTILE FUNCTION
              4.4.1 The Observed Cumulative Distribution

              The value of an observed or empirical cumulative distribution at a point
              x, given the n observations $x_1, \ldots, x_n$, is the number of observations
              $x_j$ smaller than or equal to x, divided by the total number of observations,
              n. This observed cumulative distribution will be denoted as $\hat{F}_n(x)$.
              The index n refers to the number of data (not necessarily different ones)
              and the “hat” refers to the fact that it is an observed rather than a
              theoretical distribution. A cumulative distribution can be represented as
              an increasing step function.
                 Suppose we have observed the values 1, 2, 3, and 4, each exactly
              once. What is the observed cumulative distribution $\hat{F}_n(x)$? The
              number 1 is the smallest value we have observed. Hence, $\hat{F}_n(x) = 0$
              for $x < 1$. We see that $\hat{F}_n(1) = 1/4$ and for every x in [1, 2),
              $\hat{F}_n(x)$ stays equal to 1/4. Clearly, $\hat{F}_n(2) = 2/4$,
              $\hat{F}_n(3) = 3/4$ and $\hat{F}_n(4) = 1$. This empirical function is
              constant on each half-open interval between these points and for $x \geq 4$,
              $\hat{F}_n(x) = 1$. Fig. 4.6 illustrates the result of this procedure.
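The worked example can be checked with a short Python sketch. It assumes the convention that the worked example requires, namely that observations smaller than or equal to x are counted (so that the value at x = 1 is 1/4). The name `ecdf` is a hypothetical helper introduced here, and exact fractions are used only to make the step values easy to read.

```python
from fractions import Fraction


def ecdf(observations):
    """Build the empirical cumulative distribution of the observations.

    The returned function gives, for a point x, the share of
    observations that are smaller than or equal to x.
    """
    data = list(observations)
    n = len(data)
    if n == 0:
        raise ValueError("at least one observation is required")

    def F_hat(x):
        # Count the observations x_j with x_j <= x and divide by n.
        return Fraction(sum(1 for value in data if value <= x), n)

    return F_hat


# The example from the text: the values 1, 2, 3, and 4, each observed once.
F_hat = ecdf([1, 2, 3, 4])
for x in (0.5, 1, 1.5, 2, 3, 4, 10):
    print(x, F_hat(x))
```

Evaluating the function between the observed values, for instance at 1.5, shows that it stays constant on each half-open interval and only jumps at an observation. Note that `Fraction` prints 2/4 in lowest terms, as 1/2.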


              4.4.2 The Quantile Function
              Now we ask: given the percentage p, which real number x has p as its
              cumulative frequency? Consider, for example, p = 0.25; then we are
              interested in that observation x such that 25% of all observations are
              smaller than or equal to x. For p = 0.5 we are interested in the
              observation situated in the middle. In mathematical terminology, this is
              the problem of finding an inverse function of the cumulative distribution
              function. However, since for empirical (= observed) distributions this
              function is a step function and hence not one-to-one (not injective), one
              has to agree on a precise