Page 112 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R
3.2 Estimating a Mean
Figure 3.5. Selection of cases: a) Partial view of STATISTICA “Case Selection
Conditions” window; b) Partial view of SPSS “Select Cases” window.
In MATLAB one may select a submatrix of matrix x based on a particular
value, a, of column i using the construction x(x(:,i)==a,:). For instance,
assuming the first column of cork contains the classifications of the cork
stoppers, c = cork(cork(:,1)==1,:) will retrieve the submatrix of cork
corresponding to the first 50 cases of class 1. Other relational operators can be used
instead of the equality operator “==”. (Attention: “=” is an assignment operator,
not an equality operator.) For instance, c = cork(cork(:,1)<2,:) will have the
same effect.
The selection of cases in R is usually based on the construction
x[col == a,], which selects the submatrix whose column col is equal to a certain
value a. For instance, cork[CL == 1,] selects the first 50 cases of class 1 of the
data frame cork. As in MATLAB, other relational operators can be used instead of
the equality operator “==”.
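The R construction above can be tried out with the following minimal sketch. It does not use the real cork stoppers dataset: the data frame built here is a hypothetical stand-in with 150 rows, a class column CL assumed to be coded 1, 2, 3 (50 cases each), and an invented column named value. The form cork[CL == 1,] works only when CL is visible as a variable (for example, after attaching the data frame); the sketch uses cork$CL so that it runs on its own.

```r
# Hypothetical stand-in for the cork data frame (not the real data):
# 150 cases, class column CL assumed to be coded 1, 2, 3 (50 each).
set.seed(1)
cork <- data.frame(CL = rep(1:3, each = 50),
                   value = rnorm(150))   # 'value' is an invented column

# Equality selection: all rows whose class is 1.
c1 <- cork[cork$CL == 1, ]

# Another relational operator; it selects the same rows here only
# because of the assumed 1, 2, 3 class coding.
c1b <- cork[cork$CL < 2, ]

nrow(c1)             # 50
identical(c1, c1b)   # TRUE
```

Keeping the comma after the row condition matters: cork[cork$CL == 1, ] selects rows and keeps all columns.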
Selection of random subsets in MATLAB and R can be performed through the
generation of filter variables using random number generators. An example is
shown in Table 3.1. First, a filter variable with 150 random 0s and 1s is created by
rounding random numbers with uniform distribution in [0,1]. Next, the filter
variable is used to select a subset of the 150 cases of the cork data.
Table 3.1. Selecting a random subset of the cork stoppers dataset.
MATLAB   >> filter = round(unifrnd(0,1,150,1));
         >> fcork = cork(filter==1,:);
R        > filter <- round(runif(150,0,1))
         > fcork <- cork[filter==1,]
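The R commands of Table 3.1 can be run as a minimal sketch on a hypothetical stand-in for the cork data (150 rows; the columns and the fixed seed are assumptions made only so that the example is self-contained and reproducible). Since each filter value is 0 or 1 with probability 1/2, the size of the selected subset is itself random, about 75 cases on average. The last two lines show an alternative that is not the method of Table 3.1: drawing a subset of fixed size with the base R function sample.

```r
# Hypothetical stand-in for the cork data frame (not the real data).
set.seed(42)   # assumption: fixed seed, only for reproducibility
cork <- data.frame(CL = rep(1:3, each = 50),
                   value = rnorm(150))   # 'value' is an invented column

# Filter variable: 150 random 0s and 1s from rounded uniform numbers.
filter <- round(runif(150, 0, 1))

# Keep the cases whose filter value is 1.
fcork <- cork[filter == 1, ]

# The subset has one case per 1 in the filter, so its size varies
# from run to run when the seed is not fixed.
nrow(fcork) == sum(filter)   # TRUE

# Alternative (not the method of Table 3.1): a subset of fixed size,
# here 75 cases drawn without replacement.
idx <- sample(nrow(cork), 75)
scork <- cork[idx, ]
```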