Page 55 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

34       2 Presenting and Summarising the Data


              It is often convenient to have appropriate column names for the data, instead of
           the default V1, V2, etc. One way to do this is to first create a string vector and pass
           it to the read.table function as the col.names parameter value. For the
           meteo data we could have:

              > l <- c("PMax", "RainDays", "T80", "T81", "T82")
              > meteo <- read.table(file("e:meteo.txt"), col.names=l)
              > meteo
                 PMax RainDays T80 T81 T82
              1   181      143  36  39  37
              2   114      132  35  39  36
              3   101      125  36  40  38
              ...
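
              The same call can be tried without the data file. The following is a minimal
           sketch, assuming that the first three cases listed above are supplied inline through
           the text argument of read.table; the names txt and meteo3 are hypothetical and
           used only for this illustration.

```r
# Column names taken from the listing above.
l <- c("PMax", "RainDays", "T80", "T81", "T82")

# Inline stand-in for the data file (assumption): the first three
# cases of the meteo data exactly as printed above.
txt <- "181 143 36 39 37
114 132 35 39 36
101 125 36 40 38"

# read.table() accepts a 'text' argument as an alternative to a file;
# col.names replaces the default names V1, V2, etc.
meteo3 <- read.table(text = txt, col.names = l)

print(meteo3)
print(colnames(meteo3))
```

           Without the col.names argument, the same call would label the columns V1 to V5.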

              Column names and row names¹ can also be set or retrieved with the functions
           colnames and rownames, respectively. For instance, the following sequence of
           commands assigns row names to meteo corresponding to the names of the places
           where the meteorological data was collected (see Figure 2.1):

              > r <- c("V. Castelo", "Braga", "S. Tirso",
            "Montalegre", "Bragança", "Mirandela", "M. Douro",
            "Régua", "Viseu", "Guarda", "Coimbra", "C. Branco",
            "Pombal", "Santarém", "Dois Portos", "Setúbal",
            "Portalegre", "Elvas", "Évora", "A. Sal", "Beja",
            "Amareleja", "Alportel", "Monchique", "Tavira");
              > rownames(meteo) <- r
              > meteo
                          PMax RainDays T80 T81 T82
              V. Castelo   181      143  36  39  37
              Braga        114      132  35  39  36
              S. Tirso     101      125  36  40  38
              Montalegre    80      111  34  33  31
              Bragança      36      102  37  36  35
              Mirandela     24       98  40  40  38
              M. Douro      39       96  37  37  35
              Régua         31      109  41  41  40
              ...
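
              Both functions also work in the retrieving direction, and assigned row names can
           be used to select cases. The following is a minimal sketch on the first three cases
           listed above; the data frame name meteo3, the variable braga_pmax and the
           replacement column name Rain.Days are hypothetical and chosen only for this
           illustration.

```r
# First three cases of the meteo data, as printed in the listing above.
meteo3 <- data.frame(PMax     = c(181, 114, 101),
                     RainDays = c(143, 132, 125),
                     T80      = c(36, 35, 36),
                     T81      = c(39, 39, 40),
                     T82      = c(37, 36, 38))

# Setting: assign the place names as row names.
rownames(meteo3) <- c("V. Castelo", "Braga", "S. Tirso")

# Retrieving: called without an assignment, the functions return the names.
print(rownames(meteo3))
print(colnames(meteo3))

# Row and column names can be used to select a single value.
braga_pmax <- meteo3["Braga", "PMax"]
print(braga_pmax)

# A single column can be renamed in place (hypothetical new name).
colnames(meteo3)[2] <- "Rain.Days"
print(colnames(meteo3))
```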


           2.1.2 Operating with the Data

           After reading in a data set, one often needs to define new variables according
           to a certain formula. Sometimes one also needs to manage the data in specific
           ways; for instance, sorting cases according to the values of one or more
           variables, or transposing the data, i.e., exchanging the roles of columns and
           rows. In this section, we will present only the fundamentals of such
           operations, illustrated for the meteorological dataset. We further assume that we


           ¹ Column or row names should preferably not use reserved R words.