Page 76 - Becoming Metric Wise

CHAPTER 4

              Statistics





              4.1 INTRODUCTION
              Statistical analysis can be subdivided into two parts: descriptive statistics
              and inferential statistics. In descriptive statistics, one summarizes and
              graphically represents data of a sample or a whole population. In inferen-
              tial statistics, one not only collects numerical data as a sample from a pop-
              ulation but also analyzes it and, based on this analysis, draws conclusions
              with estimated uncertainties (i.e., by using probability theory) about the
              population. It goes without saying that in order to measure aspects of sci-
              entific communication and to evaluate scientific research, scientists use
              statistical techniques. Although hundreds of books have been written on
              statistics, few deal explicitly with statistics in the framework of informa-
              tion and library science. A basic introductory text for library professionals
              is Vaughan (2001), while Egghe and Rousseau (2001) is more elementary.
              One quarter of Introduction to Informetrics (Egghe & Rousseau, 1990) is
              devoted to statistics. Ding et al. (2014) contains a practical introduction to
              recent developments in informetrics, including statistical methods.
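
To make the descriptive side of this distinction concrete, the following minimal sketch summarizes a small data set with a few standard measures. The citation counts and the helper name `describe` are hypothetical and serve only as an illustration; they are not taken from the text.

```python
import statistics

# Hypothetical citation counts for ten publications of one research group
# (illustrative values only, not real data).
citations = [0, 1, 1, 2, 3, 3, 5, 8, 13, 34]


def describe(data):
    """Return a few basic descriptive statistics for a list of numbers."""
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "stdev": statistics.stdev(data),  # sample standard deviation
    }


summary = describe(citations)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Note that the mean lies well above the median here, because a single highly cited publication pulls the mean upward. Such skewness is common in citation data, so it can be worth reporting both measures.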
                 The term population refers to the set of entities (physical or abstract
              ones) about which one seeks information. The publications of scientists
              forming a research group, of scientists in a country, or of scientists
              active in a scientific domain, as well as the articles included in Scopus
              and published during the year 2015, are all examples of populations.
                 In order to investigate a population, the investigator collects data. If it
              is possible, the best option is to include the whole population in this
              investigation. Yet, it is often impossible to collect data on the whole pop-
              ulation, so the statistician collects a representative sample. This means that
              a subset is collected in such a way that it provides a miniature image of
              the whole population. If, moreover, the sample is large enough, then a
              diligent analysis of the sample will lead to conclusions that are, to a large
              extent, also valid for the whole population. Such conclusions must be
              reliable, which implies that the probability that they are correct must be known.
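
The idea of drawing a sample and attaching a known probability to the conclusion can be sketched as follows. This is only an illustration under stated assumptions: the population is synthetic (generated from an exponential distribution chosen arbitrarily), the sample size of 400 is arbitrary, and the interval uses the usual normal approximation for large samples. The function name `mean_confidence_interval` is hypothetical.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Synthetic population: citation counts of 10,000 hypothetical articles.
# The exponential distribution is an arbitrary assumption for illustration.
population = [int(rng.expovariate(1 / 5)) for _ in range(10_000)]


def mean_confidence_interval(sample, z=1.96):
    """Approximate 95% confidence interval for the population mean.

    Uses the normal approximation, which is reasonable for large samples.
    """
    m = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return m - z * standard_error, m + z * standard_error


# Simple random sample without replacement: every article has the same
# chance of being selected, which is one way to aim for representativeness.
sample = rng.sample(population, 400)
low, high = mean_confidence_interval(sample)

print(f"sample mean: {statistics.mean(sample):.2f}")
print(f"approximate 95% interval for the population mean: [{low:.2f}, {high:.2f}]")
print(f"population mean (known here only because the data are synthetic): "
      f"{statistics.mean(population):.2f}")
```

In practice the population mean is unknown. The interval expresses the estimated uncertainty: if the sampling were repeated many times, about 95% of such intervals would be expected to contain the population mean.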
                 Classical inferential statistics draws samples from a population and then
              tries to obtain conclusions that are valid for the whole population (with a


              Becoming Metric-Wise                         © 2018 Elsevier Ltd.
              DOI: http://dx.doi.org/10.1016/B978-0-08-102474-4.00004-2  All rights reserved.