Page 98 - Becoming Metric Wise


          perhaps an 85/15 ratio or a 70/30 one. Nevertheless, the 80/20 rule is a
          well-known rule of thumb relating percentiles, which shows, in a cogent way,
          how relatively small groups or causes are responsible for, or determine, a much
          larger part of the consequences. We will next discuss a graphical tool to
          illustrate the exact relation.

          4.10.2 The Lorenz Curve
          Consider a set of N scientists and let $X = (x_i,\ i = 1, \ldots, N)$ be the sequence
          of numbers of publications they (co-)authored (using whole counting)
          during a given period of time. We assume that these authors are ranked from
          most active to least active. Let $s_j = \sum_{i=1}^{j} x_i$ be the j-th partial sum and hence
          $s_N = TOT$ the total number of publications (with possible double counting
          if scientists collaborated) of this group of scientists; $s_0$ is set equal to 0. Now
          plot the points $\left(\frac{k}{N}, \frac{s_k}{TOT}\right)_{k=0,\ldots,N}$ and connect them by line segments to obtain
          a curve joining the origin (0,0) with the point (1,1). This curve is known as
          the Lorenz curve (Lorenz, 1905). If all scientists had published the same
          number of articles, the Lorenz curve would be a straight line. Otherwise the
          curve is concave and situated above this straight line. Fig. 4.13 provides an
          illustration of a Lorenz curve. We note that the Lorenz curve can be
          considered a normalized cumulative relative frequency curve.
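
             As a minimal sketch of the construction above, the following Python
          function computes the points (k/N, s_k/TOT) for k = 0, ..., N. The name
          lorenz_points and the sample publication counts are illustrative
          assumptions, not taken from the text; integer counts are assumed, as
          is the case with whole counting.

```python
from fractions import Fraction


def lorenz_points(counts):
    """Return the Lorenz curve points (k/N, s_k/TOT) for k = 0, ..., N."""
    # Rank from most active to least active, as the text assumes.
    ranked = sorted(counts, reverse=True)
    n = len(ranked)
    total = sum(ranked)  # TOT = s_N
    if n == 0 or total == 0:
        raise ValueError("need at least one scientist and one publication")

    points = [(Fraction(0), Fraction(0))]  # s_0 = 0 gives the origin
    partial = 0
    for k, x in enumerate(ranked, start=1):
        partial += x  # s_k, the k-th partial sum
        points.append((Fraction(k, n), Fraction(partial, total)))
    return points


# Hypothetical publication counts for five scientists.
example = [1, 10, 2, 5, 2]
for x, y in lorenz_points(example):
    print(float(x), float(y))
```

             Exact fractions are used so that the end point is exactly (1,1).
          Connecting consecutive points by line segments gives the curve; when
          all counts are equal, every point lies on the diagonal.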
             It is now easy to read a 100y/100x rule from this curve: choose 100y
          (80 for example), find the point on the curve with ordinate y, and read
          off its abscissa x. The more a Lorenz curve approaches the
          diagonal, the more balanced the situation represented by it. A real 80/20
          relation indicates a very unbalanced situation. Yet (0.2, 0.8) is at most
          one point on a Lorenz curve, and hence the whole curve contains
          much more information than a single 100y/100x relation.
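
             Reading a 100y/100x rule from the curve can be sketched as follows:
          given a share y of all publications, find the fraction x of scientists
          at which the piecewise linear Lorenz curve reaches y. The function name
          fraction_for_share and the data are hypothetical, and linear
          interpolation along the segments is an assumption that follows from
          connecting the points by line segments.

```python
def fraction_for_share(counts, y):
    """Return x such that the Lorenz curve passes through (x, y).

    The curve is the piecewise linear one through (k/N, s_k/TOT), with
    counts ranked from largest to smallest.
    """
    if not 0 <= y <= 1:
        raise ValueError("y must lie between 0 and 1")
    ranked = sorted(counts, reverse=True)
    n, total = len(ranked), sum(ranked)
    if n == 0 or total == 0:
        raise ValueError("need at least one scientist and one publication")

    target = y * total  # number of publications corresponding to share y
    partial = 0         # s_k, publications of the k most active scientists
    for k, x in enumerate(ranked):
        if x > 0 and partial + x >= target:
            # The target lies on the segment between ranks k and k + 1;
            # interpolate linearly along that segment.
            return (k + (target - partial) / x) / n
        partial += x
    return 1.0


# Hypothetical publication counts for five scientists.
example = [1, 10, 2, 5, 2]
x = fraction_for_share(example, 0.8)
print(f"{100 * 0.8:.0f}/{100 * x:.0f} rule")
```

             For these made-up counts, half of the scientists are needed to
          account for 80% of the publications, so the group follows an 80/50
          rule, far more balanced than 80/20.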
             We note, though, that Lorenz curves of two situations may intersect
          and then it is not immediately clear which of the two situations is the
          more balanced or the more unequal. In such cases one uses a measure of
          inequality. A simple one is the coefficient of variation, defined as the
          standard deviation divided by the mean and denoted as V. Another measure
          of inequality is the Gini coefficient, discussed next.
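
             The coefficient of variation can be sketched in a few lines of
          Python. The text does not say whether the population or the sample
          standard deviation is meant; the population version is assumed here.
          The two data sets are hypothetical and were chosen so that their
          Lorenz curves intersect: the most active scientist holds 1/2 of the
          total in a and 5/12 in b, while the two most active hold 2/3 in a and
          5/6 in b.

```python
import statistics


def coefficient_of_variation(counts):
    """V = standard deviation / mean (population standard deviation assumed)."""
    mean = statistics.fmean(counts)
    if mean == 0:
        raise ValueError("the mean is zero, so V is undefined")
    return statistics.pstdev(counts) / mean


# Hypothetical publication counts whose Lorenz curves cross each other.
a = [6, 2, 2, 2]
b = [5, 5, 1, 1]
print("V(a) =", round(coefficient_of_variation(a), 3))
print("V(b) =", round(coefficient_of_variation(b), 3))
```

             Here V is larger for b than for a, so by this measure b is the
          more unequal of the two, a conclusion that cannot be read off directly
          from the intersecting curves.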

          4.10.3 The Gini Coefficient (Also Known as the Gini Index)
          (Gini, 1909)
          We observed that the lower the Lorenz curve (that is, the closer it lies
          to the diagonal), the smaller the inequality.
          Hence one uses the area between the Lorenz curve and the diagonal as a
          measure of inequality. More precisely, the Gini index, denoted as g, is