Page 115 - Statistics and Data Analysis in Geology

Analysis of Sequences of Data

             than the expected number of runs from a random arrangement; the null hypothesis
             and alternative are

                                              $$H_0: U \le \bar{U}$$
                                              $$H_1: U > \bar{U}$$
             and too many runs leads to rejection.  The test is one-tailed. Conversely, we may
             wish to determine  if the sequence contains an improbably low number of runs. The
             appropriate alternatives are
                                              $$H_0: U \ge \bar{U}$$
                                              $$H_1: U < \bar{U}$$

             and too few runs will cause rejection of  the null hypothesis. Again, the test is one-
             tailed. We  may wish to reject either form of  nonrandomness. A two-tailed test is
             appropriate, with hypotheses
                                              $$H_0: U = \bar{U}$$
                                              $$H_1: U \ne \bar{U}$$
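
             The three pairs of hypotheses differ only in where the critical region lies. A
             minimal Python sketch of that decision rule follows, assuming the observed number
             of runs has already been standardized to a normal score z. The function name
             `reject_null` and its `alternative` labels are illustrative and do not come from
             the text.

```python
from statistics import NormalDist


def reject_null(z, alternative="two-sided", alpha=0.05):
    """Return True if z falls in the critical region of the chosen test.

    The `alternative` labels are hypothetical names for the three cases:
      "greater"   -> H1: U > U_bar (too many runs, upper tail)
      "less"      -> H1: U < U_bar (too few runs, lower tail)
      "two-sided" -> H1: U != U_bar (either form of nonrandomness)
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if alternative == "greater":
        return z > std_normal.inv_cdf(1 - alpha)
    if alternative == "less":
        return z < std_normal.inv_cdf(alpha)
    if alternative == "two-sided":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError(f"unknown alternative: {alternative!r}")


# The same score can be rejected by a one-tailed test but not by a two-tailed one.
print(reject_null(1.8, "greater"))    # upper-tail critical value is about 1.645
print(reject_null(1.8, "two-sided"))  # two-tailed critical value is about 1.96
```

             A one-tailed test places the whole significance level in one tail, so its critical
             value is smaller in magnitude than the two-tailed value of 1.96.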
                 We  can work through the test procedure for the first series of  coin flips and
             determine the likelihood of  achieving this sequence by a random process. The null
             hypothesis states that there is no difference between the observed number of runs
             and the mean number of runs from random sequences of the same size. We will use
             a two-tailed test, and reject if there are too many or too few runs in the sequence.
             Therefore, the proper alternative is
                                              $$H_1: U \ne \bar{U}$$

                 Using a 5% ($\alpha = 0.05$) level of significance, our critical regions are bounded by
             -1.96 and +1.96. We first calculate the expected mean and standard deviation of
             runs for random sequences having $n_1$ heads ($n_1 = 11$) and $n_2$ tails ($n_2 = 9$):






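             $$\bar{U} = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(11 \cdot 9)}{11 + 9} + 1 = 10.9$$

             $$\sigma_U^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)} = \frac{2(11 \cdot 9)(2 \cdot 11 \cdot 9 - 11 - 9)}{(9 + 11)^2 (9 + 11 - 1)} = 4.6$$

             so $\sigma_U = \sqrt{4.6} \approx 2.1$.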
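
             The arithmetic above can be checked with a short Python sketch. It assumes the
             usual runs-test formulas for the mean and variance of the number of runs in a
             random two-state sequence; the function name `runs_mean_and_sd` is illustrative.
             Without intermediate rounding the standard deviation is about 2.15; the value 2.1
             used in the test statistic comes from taking the square root of the rounded
             variance, 4.6.

```python
import math


def runs_mean_and_sd(n1, n2):
    """Expected number of runs and its standard deviation for a random
    sequence containing n1 items of one kind and n2 of the other."""
    n = n1 + n2
    mean = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n1 - n2) / (n ** 2 * (n - 1))
    return mean, math.sqrt(variance)


# n1 = 11 heads and n2 = 9 tails, as in the coin-flip example.
mean_runs, sd_runs = runs_mean_and_sd(11, 9)
print(round(mean_runs, 1))     # expected number of runs
print(round(sd_runs ** 2, 1))  # variance of the number of runs
```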
             The test statistic is
             $$z = \frac{U - \bar{U}}{\sigma_U} = \frac{13 - 10.9}{2.1} = 1.0$$
             The number of  runs in the sequence is one standard deviation from the mean of
             all runs possible in such a sequence, and does not fall within the critical region.
             Therefore, the number of  runs does not suggest that the sequence is nonrandom.
             The other sequences, in contrast, yield very different test results. Because $n_1$ and
             $n_2$ are the same for all three sequences, $\bar{U}$ and $\sigma_U$ also are the same. For the second
             sequence, the test statistic is
             $$z = \frac{2 - 10.9}{2.1} = -4.2$$
             and for the third,
             $$z = \frac{19 - 10.9}{2.1} = 3.9$$
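
             The three test statistics can be reproduced with a short Python sketch. The
             sequences below are illustrative stand-ins, not quoted from the text; each was
             chosen to have 11 heads, 9 tails, and the stated number of runs (13, 2, and 19).
             The mean and standard deviation are the rounded values 10.9 and 2.1 used above,
             and the names `count_runs`, `MEAN_RUNS`, `SD_RUNS`, and `CRITICAL_Z` are
             assumptions made for the example.

```python
from itertools import groupby

MEAN_RUNS = 10.9   # expected number of runs, rounded as in the text
SD_RUNS = 2.1      # standard deviation of the number of runs, rounded as in the text
CRITICAL_Z = 1.96  # two-tailed critical value at the 5% level


def count_runs(sequence):
    """Count runs: maximal stretches of identical consecutive symbols."""
    return sum(1 for _ in groupby(sequence))


# Illustrative sequences (assumed), each with 11 heads and 9 tails.
sequences = {
    "first": "HTHHTHTTTHTHTHHTTHHH",   # 13 runs
    "second": "H" * 11 + "T" * 9,      # 2 runs
    "third": "HT" * 9 + "HH",          # 19 runs
}

results = {}
for name, flips in sequences.items():
    runs = count_runs(flips)
    z = (runs - MEAN_RUNS) / SD_RUNS
    reject = abs(z) > CRITICAL_Z       # two-tailed decision
    results[name] = (runs, round(z, 1), reject)
    print(f"{name}: U = {runs}, z = {z:.1f}, reject H0: {reject}")
```

             Only the first sequence falls inside the bounds of -1.96 and +1.96; the second has
             too few runs and the third too many to be consistent with a random arrangement.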