Page 77 - Becoming Metric Wise


          specified level of confidence). In informetrics there are often no samples;
          instead, one tries to draw conclusions from an observed population, e.g., all
          journals included in Scopus. Does this make sense? We will not answer this
          question, but refer to Section 4.14 for some useful references.
             This chapter is subdivided into two main parts. In Part A, we describe
          some techniques from descriptive statistics, while in Part B we discuss
          inferential statistics, including a short introduction to the normal
          distribution and a few nonparametric tests. Although we will not consider
          parametric tests, we briefly introduce the normal distribution because it is
          nearly impossible to talk about statistics without involving it in some way,
          e.g., when discussing z-scores. Multivariate techniques are beyond the scope
          of this introductory book; for these we refer the reader to the specialized
          literature.

          PART A. DESCRIPTIVE STATISTICS
          4.2 SIMPLE REPRESENTATIONS

          4.2.1 Nominal Categories
          If a sample or the whole population can only be subdivided into groups
          without any relation, we have nominal categories. The only measurement
          we can perform is counting how many items there are in each category.
          For example, when describing the journals used by scientist S to publish
          his research results over a certain period of time we have nominal data
          such as: 2 articles are published in journal $J_1$, 1 article in journal $J_2$,
          10 articles in $J_3$, and 5 articles in $J_4$. The number of articles published by the
          authors of different countries during a given period of time is the result
          of a counting activity performed on nominal data (countries). A binary
          scale (yes-no) is a special case of a nominal scale.
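
             As an illustration of counting on a nominal scale, the following minimal
          Python sketch tallies the journal example above. The list `articles` and the
          two helper functions are hypothetical names introduced here for illustration
          only; the counts are those given in the text.

```python
from collections import Counter

# Hypothetical publication list of scientist S: one journal label per article,
# using the counts from the text (J1: 2, J2: 1, J3: 10, J4: 5).
articles = ["J1"] * 2 + ["J2"] * 1 + ["J3"] * 10 + ["J4"] * 5


def absolute_frequencies(items):
    """Count how many items fall into each nominal category."""
    return dict(Counter(items))


def relative_frequencies(items):
    """Divide each category count by the total number of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


abs_freq = absolute_frequencies(articles)
rel_freq = relative_frequencies(articles)

for journal in abs_freq:
    print(f"{journal}: {abs_freq[journal]} articles ({rel_freq[journal]:.1%})")
```

          Counting is the only operation used: the labels themselves are never compared
          or ordered, which is what makes the scale nominal.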
             Counting results for nominal data can be represented by bar diagrams.
          Categories $c_j$, $j = 1, \ldots, k$, are represented on the horizontal axis, while the
          height of the corresponding bar is proportional to the absolute or relative
          frequency of the items in category $c_j$ (the diagram is the same in either
          case, up to the scale of the vertical axis). It does not
          matter in which order categories are shown on the horizontal axis, but
          placing them according to the obtained counts is an obvious option.
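
             A bar diagram of this kind can be sketched in a few lines of Python. The
          function `text_bar_diagram` and the dictionary `journal_counts` are
          hypothetical names used only for this illustration; the counts repeat the
          journal example of scientist S. For simplicity the sketch draws the bars
          horizontally as text, one category per line, rather than on a horizontal
          category axis, and orders the categories by decreasing count.

```python
def text_bar_diagram(counts, sort_by_count=True):
    """Return one text line per category; bar length equals the count."""
    items = list(counts.items())
    if sort_by_count:
        # Nominal categories have no inherent order, so sorting by count
        # is a presentation choice, not a property of the data.
        items.sort(key=lambda pair: pair[1], reverse=True)
    width = max(len(str(label)) for label, _ in items)
    lines = []
    for label, n in items:
        lines.append(f"{str(label):<{width}} | {'#' * n} {n}")
    return lines


# Counts taken from the journal example in the text.
journal_counts = {"J1": 2, "J2": 1, "J3": 10, "J4": 5}

diagram = text_bar_diagram(journal_counts)
for line in diagram:
    print(line)
```

          Passing `sort_by_count=False` keeps the categories in their input order,
          which is equally valid for nominal data.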
             For example, if we consider the numbers of publications in the year 2000
          included in the Web of Science with at least one author from England,
          Northern Ireland (the Web of Science writes North Ireland), Scotland, and
          Wales, respectively, we may represent these by the following bar diagram (Fig. 4.1).