Page 56 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

2.1 Preliminaries   35


           are interested in defining a new variable, PClass, that categorises the maximum
           rain precipitation (variable PMax) into three categories:

              1.  PMax ≤ 20 (low);
              2.  20 < PMax ≤ 80 (moderate);
              3.  PMax > 80 (high).

              Variable PClass can be expressed as

              PClass = 1 + (PMax > 20) + (PMax > 80),

           whenever logical values associated with relational expressions such as “PMax > 20”
           are represented by the arithmetical values 0 and 1, coding False and True,
           respectively. That is precisely how SPSS, STATISTICA, MATLAB and R handle
           such expressions. The reader can easily check that PClass values are 1, 2 and 3 in
           correspondence with the low, moderate and high categories.
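
The check suggested above can be sketched in a few lines of Python. This is only an illustration and is not taken from the book: Python is used here because, like the four packages discussed, it treats True as 1 and False as 0 in arithmetic. The function name pclass and the sample PMax values are hypothetical, chosen to cover each category and both boundaries.

```python
def pclass(pmax):
    """Return 1 (low), 2 (moderate) or 3 (high) for a maximum precipitation value."""
    # Each comparison yields a bool; in arithmetic, True counts as 1 and False as 0,
    # so every threshold that is exceeded adds one to the category code.
    return 1 + (pmax > 20) + (pmax > 80)


# Hypothetical PMax values (assumed for illustration, not data from the book).
sample = [5, 20, 20.5, 80, 80.1, 150]
classes = [pclass(x) for x in sample]
print(classes)  # [1, 1, 2, 2, 3, 3]
```

Note that the boundary values 20 and 80 fall in the lower category of each pair, in agreement with the "≤" signs in the definition of the three classes.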
              In the following subsections we will learn the essentials of data operation with
           SPSS, STATISTICA, MATLAB and R.


           2.1.2.1 SPSS

           The addition of a new variable is made in SPSS by using the Insert
           Variable option of the Data menu. In the case of the previous categorisation
           variable, one would then compute its values by using the Compute
           option of the Transform menu. The Compute Variable window shown in
           Figure 2.6 will then be displayed, where one would fill in the above formula using
           the respective variable identifiers; in this case: 1+(pmax>20)+(pmax>80).
              Looking at Figure 2.6 one may rightly suspect that a large number of functions
           are available in SPSS for building arbitrarily complex formulas.
              Other data management operations, such as sorting and transposing, can be
           performed using specific options of the SPSS Data menu.


           2.1.2.2 STATISTICA

           The addition of a new variable in STATISTICA is made with the Add
           Variable option of the Insert menu. The variable specification window
           shown in Figure 2.7 will then be displayed, where one would fill in, in particular,
           the number of variables to be added, their names and the formulas used to compute
           them. In this case, the formula is:

              1+(v1>20)+(v1>80)   .

              In STATISTICA variables are symbolically denoted by v followed by a number
           representing the position of the variable column in the spreadsheet. Since PMax
           happens to be the first column, it is denoted v1. The cases column is v0. It is
           also possible to use variable identifiers in formulas instead of v-notations.