Page 31 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R



              Several statistics, whose only assumption is the existence of a total order
           relation, can be applied to ordinal data. One such statistic is the median, as shown
           in Example 1.2.
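
As a minimal sketch of why the median needs nothing more than a total order, the following Python fragment sorts category labels by their position in an ordinal scale and picks the middle one. The scale, the data and the function name `ordinal_median` are hypothetical, chosen only for illustration; they are not the data of Example 1.2. For an even number of cases the sketch returns the lower of the two middle categories, which is one common convention among several.

```python
# Hypothetical ordinal scale, listed from the lowest to the highest category.
SCALE = ["poor", "fair", "good", "very good", "excellent"]


def ordinal_median(values, scale=SCALE):
    """Return the (lower) median category, using only the order of the scale."""
    rank = {category: position for position, category in enumerate(scale)}
    ordered = sorted(values, key=lambda v: rank[v])
    # No arithmetic is performed on the categories: only sorting and indexing.
    return ordered[(len(ordered) - 1) // 2]


# Made-up sample of seven assessments.
grades = ["good", "poor", "excellent", "fair", "good", "very good", "fair"]
median_grade = ordinal_median(grades)
print(median_grade)
```

Note that no sum or difference of categories is ever computed, which is why the same procedure would not yield a meaningful mean for such data.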
              Continuous variables have a real number interval (or a union of intervals) as
           domain, which is unique up to a linear transformation. One can further distinguish
           between ratio type variables, supporting linear transformations of the y = ax type,
           and interval type variables supporting linear transformations of the y = ax + b type.
           The domain of ratio type variables has a fixed zero. This is the most frequently
           encountered type of continuous variable, as in Example 1.3 (a zero ohm resistance is
           a zero resistance in whatever measurement scale we choose). The whole panoply of
           statistics is supported by continuous ratio type variables. The less
           common interval type variables do not have a fixed zero. An example of interval
           type data is temperature data, which can be measured either in degrees Celsius (X_C)
           or in degrees Fahrenheit (X_F), satisfying the relation X_F = 1.8 X_C + 32. Only a
           few, less frequently used statistics, namely those requiring a fixed zero, are not
           supported by this type of variable.
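
The difference between the two scale types can be sketched numerically. The Python fragment below is an illustration under stated assumptions: the data are made up, and the coefficient of variation (standard deviation divided by mean) is used here as one example of a statistic that requires a fixed zero. The mean follows a change of scale of either type, whereas the coefficient of variation is preserved only under the ratio-type change y = ax.

```python
import statistics


def celsius_to_fahrenheit(x_c):
    """Interval-type change of scale, y = ax + b, with a = 1.8 and b = 32."""
    return 1.8 * x_c + 32.0


def ohm_to_kilohm(r_ohm):
    """Ratio-type change of scale, y = ax, with a = 0.001."""
    return 0.001 * r_ohm


def coefficient_of_variation(data):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(data) / statistics.mean(data)


# Made-up measurements, for illustration only.
temps_c = [10.0, 20.0, 30.0]
temps_f = [celsius_to_fahrenheit(t) for t in temps_c]
res_ohm = [100.0, 220.0, 470.0]
res_kohm = [ohm_to_kilohm(r) for r in res_ohm]

# The mean transforms exactly like the data, for both scale types.
mean_c = statistics.mean(temps_c)
mean_f = statistics.mean(temps_f)
print(mean_f, celsius_to_fahrenheit(mean_c))

# The coefficient of variation survives y = ax but not y = ax + b.
cv_ohm = coefficient_of_variation(res_ohm)
cv_kohm = coefficient_of_variation(res_kohm)
cv_c = coefficient_of_variation(temps_c)
cv_f = coefficient_of_variation(temps_f)
print(cv_ohm, cv_kohm)
print(cv_c, cv_f)
```

The two resistance values printed on the second line agree up to rounding, while the two temperature values on the third line differ, since the offset of 32 shifts the mean without changing the spread.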
              Notice that, strictly speaking, there is no such thing as continuous data, since all
           data can only be measured with finite precision. If, for example, one is dealing
           with data representing people’s height in metres, “real-flavour” numbers such as
           1.82 m may be used. Of course, if the highest measurement precision is the
           millimetre, one is in fact dealing with integer numbers such as 1820 mm, i.e., the
           height data is, in fact, ordinal data. In practice, however, one often assumes that
           there is a continuous domain underlying the ordinal data. For instance, one often
           assumes that the height data can be measured with arbitrarily high precision. Even
           for rank data such as the examination scores of Example 1.2, one often computes
           an average score, obtaining a value in the continuous interval [0, 5], i.e., one is
           implicitly assuming that the  examination  scores  can be measured  with  a  higher
           precision.
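
Both points can be sketched in a few lines of Python. The heights and scores below are made up for illustration (they are not the data of Example 1.2), and the scores are assumed to be integers between 0 and 5. With millimetre precision the heights become integers, while averaging the integer scores yields a value that lies in [0, 5] but is not itself a possible score.

```python
import statistics

# Made-up heights recorded in metres.
heights_m = [1.82, 1.756, 1.6]

# With millimetre precision, each height is really an integer count of millimetres.
heights_mm = [round(h * 1000) for h in heights_m]
print(heights_mm)

# Made-up examination scores on an integer 0..5 scale (an assumption of this sketch).
scores = [3, 4, 4, 5, 2]

# Averaging treats the scores as if a continuous domain lay underneath them.
mean_score = statistics.mean(scores)
print(mean_score)
```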


           1.4  Probabilities and Distributions


           The process of statistically analysing a dataset involves operating with an
           appropriate measure expressing the randomness exhibited by the dataset. This
           measure is the probability measure. In this section, we will introduce a few topics
           of Probability Theory that are needed for the understanding of the following
           material. The reader familiar with Probability Theory can skip this section. A more
           detailed survey (but still a brief one) of Probability Theory can be found in
           Appendix A.



           1.4.1 Discrete Variables
           The beginnings of Probability Theory can be traced far back in time to studies on
           chance games. The work of the Swiss mathematician Jacob Bernoulli (1654-1705),
           Ars Conjectandi, represented a keystone in the development of a  Theory of