Page 220 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

5.3 Inference on Two Populations


           5.3.1 Tests for Two Independent Samples


           Commands 5.8. SPSS, STATISTICA, MATLAB and R commands used to
           perform non-parametric tests on two independent samples.

             SPSS          Analyze; Nonparametric Tests;
                           2 Independent Samples
             STATISTICA    Statistics; Nonparametrics; Comparing two
                           independent samples (groups)
             MATLAB        [p,h,stats]=ranksum(x,y,alpha)
             R             ks.test(x,y) ;
                           wilcox.test(x,y) | wilcox.test(x~y)



           5.3.1.1  The Kolmogorov-Smirnov Two-Sample Test
           The Kolmogorov-Smirnov test is used to assess whether two independent samples
           were drawn from the same population or from populations with the same
           distribution of the variable X being tested, which is assumed to be continuous.
           Let F(x) and G(x) represent the unknown cumulative distribution functions of the
           two populations from which the samples were drawn. The null hypothesis is
           formalised as:

             H0: Data variable X has equal cumulative probability distributions for the two
                 samples: F(x) = G(x).

              The test is conducted similarly to the way described in section 5.1.4. Let S_m(x)
           and S_n(x) represent the empirical distributions of the two samples, with sizes m and
           n, respectively. We then use as test statistic the maximum deviation of these
           empirical distributions:

              D_{m,n} = \max_x | S_n(x) - S_m(x) |.                        (5.29)
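As an illustration of formula 5.29, the statistic can be computed directly from the two empirical distributions. The following is a minimal Python sketch, not one of the package commands listed in Commands 5.8. The function name `ks_two_sample_statistic` and the sample values are hypothetical, chosen only for the example. Because both empirical distributions are step functions that change only at observed values, it is enough to evaluate the difference at the pooled data points.

```python
from bisect import bisect_right


def ks_two_sample_statistic(x, y):
    """Return D = max |S_n(t) - S_m(t)| for samples x (size m) and y (size n)."""
    xs, ys = sorted(x), sorted(y)
    m, n = len(xs), len(ys)
    d = 0.0
    # The empirical distributions only jump at observed values,
    # so the maximum deviation is reached at one of the pooled points.
    for t in xs + ys:
        s_m = bisect_right(xs, t) / m  # fraction of x values <= t
        s_n = bisect_right(ys, t) / n  # fraction of y values <= t
        d = max(d, abs(s_n - s_m))
    return d


# Hypothetical illustrative data (not taken from the text).
x = [1.2, 2.3, 3.1, 4.8, 5.0]
y = [2.9, 3.5, 4.1, 6.2, 7.7, 8.0]
d = ks_two_sample_statistic(x, y)
print(d)
```

For these two small samples the largest gap occurs at t = 5.0, where all five x values but only three of the six y values have been reached.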

              For large samples (say, m and n above 25) and two-tailed tests (the most usual),
           the significance of D_{m,n} can be evaluated using the critical values obtained with the
           expression:

              c \sqrt{\frac{m+n}{mn}},                                     (5.30)

           where c is a coefficient that depends on the significance level, namely c = 1.36 for
           α = 0.05 (for details, see e.g. Siegel S, Castellan Jr NJ, 1998).
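The large-sample critical value of expression 5.30 is c times the square root of (m + n)/(mn), and the null hypothesis is rejected when the observed statistic exceeds it. A minimal Python sketch of this decision rule follows. The function names, the sample sizes and the observed statistic are hypothetical, and only the coefficient c = 1.36 for α = 0.05 comes from the text; other significance levels would need their own coefficient from the cited tables.

```python
import math


def ks_critical_value(m, n, c=1.36):
    """Large-sample two-tailed critical value c * sqrt((m + n) / (m * n)).

    The default c = 1.36 corresponds to alpha = 0.05, as stated in the text.
    """
    return c * math.sqrt((m + n) / (m * n))


def reject_h0(d_observed, m, n, c=1.36):
    """Return True when the observed statistic exceeds the critical value."""
    return d_observed > ks_critical_value(m, n, c)


# Hypothetical example: two samples of 50 cases each.
crit = ks_critical_value(50, 50)
print(round(crit, 3))             # 1.36 * sqrt(100 / 2500) = 1.36 * 0.2
print(reject_h0(0.30, 50, 50))    # a hypothetical observed D of 0.30
```

With m = n = 50 the critical value is 0.272, so a hypothetical observed deviation of 0.30 would lead to rejecting H0 at the 5% level, whereas 0.25 would not.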
              When compared with its parametric counterpart, the t test, the Kolmogorov-Smirnov
           test has a high power-efficiency of about 95%, even for small samples.