Page 310 - Becoming Metric Wise
larger than or equal to 1. This power law is called Lotka’s law because the
mathematician, physical chemist and specialist in population dynamics
Alfred Lotka first formulated this law in 1926 in the context of authors
(considering only first authors) and the number of articles they had written
(Lotka, 1926). More precisely, Lotka used two data sets. One consisted of
the publications of authors whose names began with A or B and whose
publications were included in Chemical Abstracts (1907–1916); the other
consisted of articles by physicists included in Auerbach’s Geschichtstafeln der Physik
of 1910. Using an estimation procedure based on linear regression on a
log-log scale (and after removing some outliers), he obtained an α value of 2.02
for Auerbach’s data and an α value of 1.89 for the Chemical Abstracts data.
As a first approximation one may say that, at least for his data, α ≈ 2.
Consequently, in many modelling exercises one takes α = 2. This value is
also of interest for another reason. Experience has shown that α = 2 is a
real turning point for several Lotkaian properties. Examples of such phe-
nomena—of a more advanced nature—can be found in (Egghe, 2005) and
further in Subsection 9.4.2.
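The regression on a log-log scale mentioned above can be illustrated with a minimal sketch. It assumes the Lotka function has the form f(n) = C/n^α, so that log f(n) = log C − α log n is a straight line with slope −α. The function name `fit_lotka_exponent` and the synthetic counts are illustrative assumptions; they are not Lotka's data and not necessarily his exact procedure (no outliers are removed here).

```python
import math

def fit_lotka_exponent(counts):
    """Ordinary least-squares fit of log f(n) = log C - alpha * log n.

    counts maps n (number of articles) to f(n) (number of authors
    with exactly n articles). Returns the pair (alpha, C).
    """
    # Work on the log-log scale; skip empty cells, whose log is undefined.
    points = [(math.log(n), math.log(f))
              for n, f in counts.items() if n > 0 and f > 0]
    m = len(points)
    mean_x = sum(x for x, _ in points) / m
    mean_y = sum(y for _, y in points) / m
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The slope of the fitted line is -alpha, the intercept is log C.
    return -slope, math.exp(intercept)

# Synthetic (hypothetical) data generated exactly from f(n) = 1000 / n^2.
synthetic = {n: 1000.0 / n ** 2 for n in range(1, 11)}
alpha, C = fit_lotka_exponent(synthetic)
print(alpha, C)
```

Because the synthetic counts follow a power law exactly, the fit recovers the generating parameters up to rounding error; real publication data would scatter around the line.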
The Lotka function (9.7) describes a highly elitist situation. Indeed,
f(1) = C, and if α = 2 it can be shown that the percentage of authors with
just one article is equal to 60.79%. This result clearly illustrates that
“many sources have few items.”
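The figure of 60.79% can be checked numerically. The sketch below assumes that f(n) = C/n² holds for every integer n ≥ 1, with no upper bound on the number of articles. The total number of authors is then C(1 + 1/4 + 1/9 + ...) = Cπ²/6, so the share of authors with one article is f(1) divided by this total, that is, 6/π². The helper `single_item_share` is a hypothetical name introduced for illustration.

```python
import math

def single_item_share(alpha, n_max):
    """Share of sources with exactly one item when f(n) = C / n**alpha
    for n = 1, ..., n_max (the constant C cancels out)."""
    total = sum(1.0 / n ** alpha for n in range(1, n_max + 1))
    return 1.0 / total

# Closed form for alpha = 2: the reciprocal of pi^2 / 6.
exact = 6.0 / math.pi ** 2
# Truncated sum as an independent numerical check.
approx = single_item_share(2, 100_000)
print(f"closed form: {exact:.4%}, truncated sum: {approx:.4%}")
```

The truncated sum slightly overestimates the share, because the (small) tail of authors with more than `n_max` articles is ignored.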
The rank-frequency function corresponding with the Lotka function
is a function known as Zipf’s law:
\[
g(r) = \frac{B}{r^{\beta}} \tag{9.8}
\]
(B, β > 0). This function too is a power function, but note that here
the variable, r, denotes a rank. Observe that this function is injective as
required in formula (9.6). Indeed, the inverse of g, denoted as $g^{-1}$, is
$g^{-1}(s) = (B/s)^{1/\beta} = B^{1/\beta}/s^{1/\beta}$. If one applies a function and then its inverse, then one
must obtain the identity function. Using the standard mathematical notation
∘ to denote the composition of functions (applying one function after
the other), we check:
\[
(g^{-1} \circ g)(r) = g^{-1}(g(r)) = \frac{B^{1/\beta}}{(g(r))^{1/\beta}} = \frac{B^{1/\beta}}{(B \cdot r^{-\beta})^{1/\beta}} = \frac{B^{1/\beta}}{B^{1/\beta} \cdot r^{-1}} = r \tag{9.9}
\]
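The identity checked in (9.9) can also be verified numerically. The following is a minimal sketch, assuming g(r) = B/r^β and its inverse (B/s)^(1/β); the parameter values B = 500 and β = 1.3 are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def zipf(r, B, beta):
    """Zipf function g(r) = B / r**beta."""
    return B / r ** beta

def zipf_inverse(s, B, beta):
    """Inverse function: g^{-1}(s) = (B / s)**(1 / beta)."""
    return (B / s) ** (1.0 / beta)

B, beta = 500.0, 1.3  # arbitrary illustrative parameters, B > 0 and beta > 0
ranks = list(range(1, 21))

# Apply g and then its inverse: every rank should be recovered.
recovered = [zipf_inverse(zipf(r, B, beta), B, beta) for r in ranks]
round_trip_ok = all(math.isclose(r, x, rel_tol=1e-9)
                    for r, x in zip(ranks, recovered))
print(round_trip_ok)
```

The comparison uses `math.isclose` rather than exact equality, since floating-point powers introduce small rounding errors.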
If β = 1, Zipf’s law can be formulated as: The product of the rank order
of an author (originally: word type) and his number of articles (originally,
number of occurrences or tokens) is a constant for a given database