Page 55 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R
34 2 Presenting and Summarising the Data
It is often convenient to have appropriate column names for the data, instead of
the default V1, V2, etc. One way to do this is to first create a string vector and pass
it to the read.table function as the col.names parameter value. For the
meteo data we could have:
> l <- c("PMax", "RainDays", "T80", "T81", "T82")
> meteo <- read.table(file("e:meteo.txt"), col.names=l)
> meteo
  PMax RainDays T80 T81 T82
1  181      143  36  39  37
2  114      132  35  39  36
3  101      125  36  40  38
...
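As a minimal, self-contained sketch of the same idea, the snippet below reads a small inline excerpt instead of a file, so that it runs without access to meteo.txt. The variable txt and its three rows are an assumption made for illustration (they copy the first rows of the listing above); the text argument of read.table is used in place of the file connection.

```r
# Hypothetical three-row excerpt standing in for the contents of meteo.txt
txt <- "181 143 36 39 37
114 132 35 39 36
101 125 36 40 38"

# String vector with the desired column names
l <- c("PMax", "RainDays", "T80", "T81", "T82")

# Without col.names, read.table falls back to the default names V1, V2, ...
default <- read.table(text = txt)
print(names(default))

# With col.names, the columns are labelled as requested
meteo <- read.table(text = txt, col.names = l)
print(names(meteo))
print(meteo)
```

The length of the names vector should match the number of columns in the data, which is why l has exactly five entries here.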
Column names and row names can also be set or retrieved with the functions
colnames and rownames, respectively. For instance, the following sequence of
commands assigns row names to meteo corresponding to the names of the places
where the meteorological data was collected (see Figure 2.1):
> r <- c("V. Castelo", "Braga", "S. Tirso",
    "Montalegre", "Bragança", "Mirandela", "M. Douro",
    "Régua", "Viseu", "Guarda", "Coimbra", "C. Branco",
    "Pombal", "Santarém", "Dois Portos", "Setúbal",
    "Portalegre", "Elvas", "Évora", "A. Sal", "Beja",
    "Amareleja", "Alportel", "Monchique", "Tavira");
> rownames(meteo) <- r
> meteo
           PMax RainDays T80 T81 T82
V. Castelo  181      143  36  39  37
Braga       114      132  35  39  36
S. Tirso    101      125  36  40  38
Montalegre   80      111  34  33  31
Bragança     36      102  37  36  35
Mirandela    24       98  40  40  38
M. Douro     39       96  37  37  35
Régua        31      109  41  41  40
...
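The following sketch illustrates both uses of rownames and colnames, setting and retrieving, on a three-row excerpt of the data. The data frame is built by hand here only to keep the example self-contained (an assumption for illustration; the values are those of the first three rows listed above). It also shows that, once row names are set, a value can be looked up by place name.

```r
# Hand-built excerpt of the meteo data (first three rows only)
meteo <- data.frame(PMax     = c(181, 114, 101),
                    RainDays = c(143, 132, 125),
                    T80      = c(36, 35, 36),
                    T81      = c(39, 39, 40),
                    T82      = c(37, 36, 38))

# Set the row names ...
rownames(meteo) <- c("V. Castelo", "Braga", "S. Tirso")

# ... and retrieve row and column names
print(rownames(meteo))
print(colnames(meteo))

# Rows and columns can now be addressed by name
print(meteo["Braga", "PMax"])   # 114
```

Row names must be unique, so this approach suits identifiers such as place names, which do not repeat across cases.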
2.1.2 Operating with the Data
After reading in a data set, one often needs to define
new variables according to a certain formula. Sometimes one also needs to
manage the data in specific ways, for instance, sorting cases according to the
values of one or more variables, or transposing the data, i.e., exchanging the roles
of columns and rows. In this section, we will present only the fundamentals of such
operations, illustrated for the meteorological dataset. We further assume that we
1 Column or row names should preferably not use reserved R words.