Page 310 - Becoming Metric Wise



          larger than or equal to 1. This power law is called Lotka’s law because the
          mathematician, physical chemist and specialist in population dynamics
          Alfred Lotka first formulated this law in 1926 in the context of authors
          (considering only first authors) and the number of articles they had written
          (Lotka, 1926). More precisely, Lotka used two data sets. One consisted of
          the publications of authors whose names began with A or B and whose
          publications were included in Chemical Abstracts (1907–1916); the other
          consisted of articles by physicists included in Auerbach’s Geschichtstafeln
          der Physik of 1910. Using an estimation procedure based on linear regression
          on a log-log scale (and after removing some outliers) he obtained an α value
          of 2.02 for Auerbach’s data and an α value of 1.89 for the Chemical Abstracts
          data. As a first approximation one may say that, at least for his data, α ≈ 2.
          Consequently, in many modelling exercises one takes α = 2. This value is
          also of interest for another reason. Experience has shown that α = 2 is a
          real turning point for several Lotkaian properties. Examples of such phe-
          nomena—of a more advanced nature—can be found in (Egghe, 2005) and
          further in Subsection 9.4.2.
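
The log-log regression estimate described above can be illustrated with a minimal Python sketch. It assumes the Lotka function has the power-law form f(j) = C / j^α, and it uses synthetic, noise-free frequencies rather than Lotka's actual data. The function name and the values chosen for C and α are illustrative assumptions, and the sketch does not reproduce Lotka's own procedure (there is no outlier removal, for instance).

```python
import math

def estimate_alpha(counts):
    """Estimate the Lotka exponent by ordinary least squares on a log-log scale.

    counts maps j (number of articles) to f(j) (number of authors with j articles).
    If f(j) = C / j**alpha, then log f(j) = log C - alpha * log j,
    so the fitted slope is -alpha and the intercept is log C.
    """
    xs = [math.log(j) for j in counts]
    ys = [math.log(f) for f in counts.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)

# Synthetic, noise-free data (not Lotka's): C = 1000, alpha = 2.
synthetic = {j: 1000 / j ** 2 for j in range(1, 21)}
alpha_hat, c_hat = estimate_alpha(synthetic)
```

On this idealised input the fit recovers α = 2 and C = 1000 up to rounding. With real, noisy counts the estimate depends on how the sparse tail of highly productive authors is handled.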
             The Lotka function (9.7) describes a highly elitarian situation. Indeed,
          f(1) = C, and if α = 2 it can be shown that the percentage of authors with
          just one article is equal to 60.79%. This result clearly illustrates that
          “many sources have few items.”
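
The 60.79% figure can be checked numerically. The sketch below assumes the Lotka function has the form f(j) = C / j^α for j = 1, 2, 3, ... with no upper limit on j. Under that assumption the share of single-article authors is f(1) divided by the sum of all f(j), which for α = 2 equals 1/ζ(2) = 6/π². The function name and the truncation point are illustrative choices.

```python
import math

def share_single_article(alpha, terms=1_000_000):
    """Share of authors with exactly one article when f(j) = C / j**alpha.

    C cancels: f(1) / sum_j f(j) = 1 / sum_j j**(-alpha).
    The infinite sum is truncated after `terms` terms (an approximation).
    """
    total = sum(j ** -alpha for j in range(1, terms + 1))
    return 1.0 / total

approx = share_single_article(2)
exact = 6 / math.pi ** 2   # 1 / zeta(2), since zeta(2) = pi**2 / 6
print(f"{approx:.4f} {exact:.4f}")   # prints 0.6079 0.6079
```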
             The rank-frequency function corresponding with the Lotka function
          is a function known as Zipf’s law:
                    \[ g(r) = \frac{B}{r^{\beta}} \tag{9.8} \]
             (B, β > 0). This function too is a power function, but note that here
          the variable, r, denotes a rank. Observe that this function is injective, as
          required in formula (9.6). Indeed, the inverse of g, denoted as $g^{-1}$, is
          $g^{-1}(s) = (B/s)^{1/\beta}$. If one applies a function and then its inverse,
          one must obtain the identity function. Using the standard mathematical
          notation ∘ to denote the composition of functions (applying one function after
          the other), we check:
          \[ (g^{-1} \circ g)(r) = g^{-1}(g(r)) = \frac{B^{1/\beta}}{(g(r))^{1/\beta}} = \frac{B^{1/\beta}}{(B \cdot r^{-\beta})^{1/\beta}} = \frac{B^{1/\beta}}{B^{1/\beta} \cdot r^{-1}} = r \tag{9.9} \]
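
The check in (9.9) can also be carried out numerically. The following Python sketch uses arbitrary illustrative values for B and β (assumptions, not taken from any data set) and confirms that applying g and then its inverse returns the original rank.

```python
import math

B, BETA = 500.0, 0.8   # illustrative parameters, both > 0

def g(r):
    """Zipf's law (9.8): g(r) = B / r**beta for rank r."""
    return B / r ** BETA

def g_inv(s):
    """Inverse of g: the rank r at which g(r) equals s."""
    return (B / s) ** (1 / BETA)

# (g_inv o g)(r) should return r, up to floating-point rounding.
round_trip_ok = all(
    math.isclose(g_inv(g(r)), r, rel_tol=1e-9) for r in range(1, 101)
)
```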
             If β = 1, Zipf’s law can be formulated as: The product of the rank order
          of an author (originally: word type) and his number of articles (originally,
          number of occurrences or tokens) is a constant for a given database