Page 222 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

5.3 Inference on Two Populations


   In order to assess these hypotheses, the Mann-Whitney test starts by assigning
ranks to the samples. Let the samples be denoted $x_1, x_2, \ldots, x_n$ and
$y_1, y_2, \ldots, y_m$. The ranking of the $x_i$ and $y_i$ assigns ranks in
$1, 2, \ldots, n + m$. As an example, let us consider the following situation:

   x_i:   12   21   15    8
   y_i:    9   13   19

   The ranking of $x_i$ and $y_i$ would then yield the result:

   Variable:   X    Y    X    Y    X    Y    X
   Data:       8    9   12   13   15   19   21
   Rank:       1    2    3    4    5    6    7

              The test statistic is the sum of the ranks for one of the variables, say X:

   $$ W_X = \sum_{i=1}^{n} R(x_i), \tag{5.31} $$

where $R(x_i)$ are the ranks assigned to the $x_i$. For the example above, $W_X = 16$.
Similarly, $W_Y = 12$ with:

   $$ W_X + W_Y = \frac{N(N+1)}{2}, $$

the total sum of the ranks from 1 through $N = n + m$.
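The ranking step and the two rank sums for the example can be reproduced with a short Python sketch. The helper name `rank_sums` is hypothetical (it does not come from the text), and the code assumes there are no tied values, as is the case in the example; ties would need average ranks, which this sketch does not handle.

```python
# Samples from the example in the text.
x = [12, 21, 15, 8]
y = [9, 13, 19]

def rank_sums(x, y):
    """Return (W_X, W_Y): the rank sums of each sample in the pooled ranking.

    Assumes no tied values, so ranks are simply 1, 2, ..., n + m
    in increasing order of the pooled data.
    """
    # Pool the data, remembering which sample each value came from.
    pooled = sorted([(v, "X") for v in x] + [(v, "Y") for v in y])
    sums = {"X": 0, "Y": 0}
    for rank, (_, label) in enumerate(pooled, start=1):
        sums[label] += rank
    return sums["X"], sums["Y"]

w_x, w_y = rank_sums(x, y)
n_total = len(x) + len(y)

# The two rank sums must add up to N(N + 1)/2.
print(w_x, w_y, n_total * (n_total + 1) // 2)  # prints: 16 12 28
```

The final check, that the two sums add up to N(N + 1)/2, is a cheap way to catch ranking mistakes.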

   The rationale for using $W_X$ as a test statistic is that under the null hypothesis,
$P(X > Y) = 1/2$, one expects the ranks to be randomly distributed between the $x_i$ and
$y_i$, therefore resulting in approximately equal average ranks in each of the two
samples. For small samples, there are tables with the exact probabilities of $W_X$. For
large samples (say $m$ or $n$ above 10), the sampling distribution of $W_X$ rapidly
approaches the normal distribution with the following parameters:

   $$ \mu_{W_X} = \frac{n(N+1)}{2}; \qquad \sigma^2_{W_X} = \frac{nm(N+1)}{12}. \tag{5.32} $$

              Therefore, for large samples, the following test statistic with standard normal
           distribution is used:

   $$ z^* = \frac{W_X \pm 0.5 - \mu_{W_X}}{\sigma_{W_X}}. \tag{5.33} $$

   The 0.5 continuity correction factor is added when one wants to determine
critical points in the left tail of the distribution, and subtracted to determine critical
points in the right tail of the distribution.
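As a minimal sketch of the large-sample computation, the following Python code evaluates the normal parameters and the continuity-corrected statistic. The function names `normal_params` and `z_star`, the `tail` argument, and the values W_X = 140, n = 12, m = 11 are hypothetical illustrations, not taken from the text. The small example with n = 4 and m = 3 is used only to check the parameter formulas, since samples of that size call for the exact tables rather than the normal approximation.

```python
import math

def normal_params(n, m):
    """Mean and variance of W_X under the null hypothesis."""
    N = n + m
    mu = n * (N + 1) / 2
    var = n * m * (N + 1) / 12
    return mu, var

def z_star(w_x, n, m, tail):
    """Continuity-corrected standardized statistic.

    tail="left" adds 0.5 to W_X; tail="right" subtracts 0.5.
    """
    if tail not in ("left", "right"):
        raise ValueError("tail must be 'left' or 'right'")
    mu, var = normal_params(n, m)
    correction = 0.5 if tail == "left" else -0.5
    return (w_x + correction - mu) / math.sqrt(var)

# Parameters for the small example in the text (n = 4, m = 3).
print(normal_params(4, 3))  # prints: (16.0, 8.0)

# Hypothetical larger samples, where the approximation is appropriate.
print(z_star(140, n=12, m=11, tail="left"))
```

For the text's example the observed W_X = 16 coincides with the null mean, so the data give no indication of a difference between the two samples.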
              When compared with its parametric counterpart, the t test, the Mann-Whitney
           test has a high power-efficiency, of about 95.5%, for moderate to large n. In some