Page 191 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R
5 Non-Parametric Tests of Hypotheses
5.1 Inference on One Population
5.1.1 The Runs Test
The runs test assesses whether or not a sequence of observations can be accepted
as a random sequence, that is, with independent successive observations. Note that
most tests of hypotheses do not care about the order of the observations. Consider,
for instance, the meteorological data used in Example 4.1. In this example, when
testing the mean based on a sample of maximum temperatures, the order of the
observations is immaterial. The maximum temperatures could be sorted in
increasing or decreasing order, or randomly shuffled, and the test would still give
exactly the same results.
Sometimes, however, when analysing sequences of observations, one has to
decide whether a given sequence of values can be assumed to exhibit random
behaviour.
Consider the following sequences of n = 12 trials of a dichotomous experiment,
as one could possibly obtain when tossing a coin:
Sequence 1: 0 0 0 0 0 0 1 1 1 1 1 1
Sequence 2: 0 1 0 1 0 1 0 1 0 1 0 1
Sequence 3: 0 0 1 0 1 1 1 0 1 1 0 0
Sequences 1 and 2 would be rejected as random since a dependency pattern is
clearly present¹. Such sequences raise a reasonable suspicion concerning either the
“fairness” of the coin-tossing experiment or the absence of some kind of data
manipulation (e.g. sorting) of the experimental results. Sequence 3, on the other
hand, seems a good candidate for a sequence with a random pattern.
The runs test analyses the randomness of a sequence of dichotomous trials. Note
that all the tests described in the previous chapter (and others to be described next
as well) are insensitive to data sorting. For instance, when testing the mean of the
three sequences above, with H0: µ = 6/12 = ½, one obtains the same results.
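The insensitivity to ordering can be checked numerically. The following is a minimal Python sketch, not the book's own procedure: it assumes a one-sample t statistic as the test of the mean, and the names `sequences`, `t_statistic` and `summaries` are introduced here for illustration only.

```python
from math import sqrt
from statistics import mean, stdev

# The three sequences of n = 12 dichotomous trials from the text.
sequences = {
    "Sequence 1": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    "Sequence 2": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "Sequence 3": [0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0],
}


def t_statistic(x, mu0):
    """One-sample t statistic (assumed here as the test of the mean)."""
    n = len(x)
    return (mean(x) - mu0) / (stdev(x) / sqrt(n))


# Sample mean, sample standard deviation and t statistic for H0: mu = 1/2.
summaries = {
    name: (mean(seq), stdev(seq), t_statistic(seq, 0.5))
    for name, seq in sequences.items()
}

for name, (m, s, t) in summaries.items():
    print(f"{name}: mean = {m:.3f}, sd = {s:.3f}, t = {t:.3f}")
```

All three sequences contain six 0s and six 1s, so the mean, the standard deviation and therefore the statistic coincide; only an order-sensitive procedure such as the runs test can tell the sequences apart.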
The test procedure uses the number of occurrences of each category, say n1 and
n2 for 1 and 0 respectively, and the number of runs, i.e., the number of maximal
subsequences of equal values, each delimited by a different value (or by the ends
of the sequence). For sequence 3, the number of runs, r, is equal to 7, as seen below:
Sequence 3: 0 0 1 0 1 1 1 0 1 1 0 0
Runs:       1   2 3 4     5 6   7
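The quantities used by the test (the category counts and the number of runs) can be obtained with a few lines of code. The sketch below is illustrative only; the helper names `count_runs` and `category_counts` are hypothetical and do not come from the book or from any statistics package.

```python
from itertools import groupby


def count_runs(seq):
    """Number of runs: maximal blocks of equal consecutive values."""
    # groupby yields one group per block of consecutive equal values.
    return sum(1 for _ in groupby(seq))


def category_counts(seq):
    """Return (n1, n2): the number of 1s and the number of 0s."""
    n1 = sum(1 for x in seq if x == 1)
    n2 = sum(1 for x in seq if x == 0)
    return n1, n2


sequences = {
    "Sequence 1": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    "Sequence 2": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "Sequence 3": [0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0],
}

for name, seq in sequences.items():
    n1, n2 = category_counts(seq)
    print(f"{name}: n1 = {n1}, n2 = {n2}, r = {count_runs(seq)}")
```

With n1 = n2 = 6 in every case, the number of runs is r = 2 for sequence 1, r = 12 for sequence 2 and r = 7 for sequence 3: the two suspicious sequences sit at the extremes (too few or too many runs), while sequence 3 lies in between.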
¹ Note that we are assessing the randomness of the sequence, not of the process that generated it.