Page 206 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

5.1 Inference on One Population   187


           5.1.5 The Lilliefors Test for Normality
           The Lilliefors test resembles the Kolmogorov-Smirnov test, but it is specifically
           tailored to assess the normality of a distribution, with the null hypothesis formalised as:

              $H_0: F(x) = N_{\mu,\sigma}(x)$.                                 5.14

              For this purpose, the test standardises the data using the sample estimates of µ
           and σ. Let Z represent the standardised data, i.e., $z_i = (x_i - \bar{x})/s$. The Lilliefors
           test statistic is:

              $D_n = \max | F(z) - S_n(z) |$.                                 5.15
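
              As an illustration, the statistic can be computed with a few lines of Python.
           This is only a minimal sketch: the function names and the sample values are
           assumptions made for the example, F is taken as the standard normal distribution
           function and S_n as the empirical distribution function of the standardised data.

```python
import math
from statistics import mean, stdev


def normal_cdf(z):
    """Standard normal distribution function F(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lilliefors_statistic(x):
    """Return D_n = max |F(z) - S_n(z)| for the standardised sample."""
    n = len(x)
    m, s = mean(x), stdev(x)  # sample estimates of mu and sigma (s uses n - 1)
    z = sorted((xi - m) / s for xi in x)
    d = 0.0
    for i, zi in enumerate(z, start=1):
        f = normal_cdf(zi)
        # S_n jumps from (i - 1)/n to i/n at z_i, so both sides of the step are checked
        d = max(d, abs(f - (i - 1) / n), abs(i / n - f))
    return d


# Made-up values, for illustration only (not the data of any example in the text)
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6, 4.4]
print(round(lilliefors_statistic(sample), 4))
```

              Since the data are standardised first, the value of the statistic does not change
           when the sample is shifted or rescaled.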

              The test is, therefore, performed like the Kolmogorov-Smirnov test (see formula
           5.12), but with the advantage that the sampling distribution of $D_n$ takes into
           account the fact that the sample mean and sample standard deviation are used. The
           asymptotic critical points are:

              $d_{n,0.01} = 1.031/\sqrt{n}$;  $d_{n,0.05} = 0.886/\sqrt{n}$;  $d_{n,0.10} = 0.805/\sqrt{n}$.      5.16
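
              The asymptotic critical points of formula 5.16 are easily evaluated for a given
           sample size. The following Python sketch is illustrative only; the names
           LILLIEFORS_COEFF, lilliefors_critical_point and reject_normality are assumptions
           of the example. Being asymptotic, these points are meant for large n.

```python
import math

# Coefficients c_alpha of the asymptotic critical points, d_{n,alpha} = c_alpha / sqrt(n)
LILLIEFORS_COEFF = {0.01: 1.031, 0.05: 0.886, 0.10: 0.805}


def lilliefors_critical_point(n, alpha=0.05):
    """Asymptotic critical point for sample size n at significance level alpha."""
    return LILLIEFORS_COEFF[alpha] / math.sqrt(n)


def reject_normality(d_n, n, alpha=0.05):
    """True when the observed statistic D_n exceeds the critical point."""
    return d_n > lilliefors_critical_point(n, alpha)


# For n = 100 the 5% critical point is 0.886 / 10
print(round(lilliefors_critical_point(100, 0.05), 4))  # 0.0886
```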

              Critical values and extensive tables of the sampling distribution of $D_n$ can be
           found in the literature (see e.g. Conover, 1980).
              The Lilliefors test can be performed with SPSS and STATISTICA as described
           in Commands 5.5. When applied to Example 5.8 it produces a lower bound for the
           significance (p = 0.2), therefore not providing evidence allowing us to reject the
           null hypothesis.

           5.1.6 The Shapiro-Wilk Test for Normality
           The Shapiro-Wilk test is also tailored to assess the goodness of fit to the normal
           distribution. It is based on the observed distance between symmetrically positioned
           data values. Let us assume that the sample size is n and the successive values $x_1$,
           $x_2$, …, $x_n$ were preliminarily sorted by increasing value:

              $x_1 \le x_2 \le \dots \le x_n$.

              The distance of symmetrically positioned data values, around the middle value,
           is measured by:

              $(x_{n-i+1} - x_i)$,  for   i = 1, 2, ..., k,

           where k = (n + 1)/2 if n is odd and k = n/2 otherwise.
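
              The distances can be obtained as in the following Python sketch (the function
           name symmetric_distances is an assumption of the example). Note that for odd n
           the last distance pairs the middle value with itself and is therefore zero.

```python
def symmetric_distances(x):
    """Distances x_(n-i+1) - x_(i), i = 1, ..., k, of the sorted sample."""
    xs = sorted(x)
    n = len(xs)
    k = (n + 1) // 2 if n % 2 else n // 2
    # The 1-based pair (x_{n-i+1}, x_i) is (xs[n - i], xs[i - 1]) with 0-based indexing
    return [xs[n - i] - xs[i - 1] for i in range(1, k + 1)]


print(symmetric_distances([3, 1, 4, 1, 5]))  # sorted: 1 1 3 4 5 -> [4, 3, 0]
print(symmetric_distances([2, 7, 4, 9]))     # sorted: 2 4 7 9   -> [7, 3]
```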
              The Shapiro-Wilk statistic is given by:

                    k            2  n
                                             2
              W  =   ∑ a ( x n −i +1  − x )   / ∑ x(  i  − x) .          5.17
                      i
                               i
                  i =1             = i 1
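
              Formula 5.17 can be evaluated as in the Python sketch below, under the
           assumption that the coefficients a_i are supplied by the user (e.g. from published
           tables); the function name shapiro_wilk_w is hypothetical. The usage line takes
           a_1 = 0.7071 as the tabulated coefficient for n = 3, a value that is not given in
           this section.

```python
from statistics import mean


def shapiro_wilk_w(x, a):
    """W of formula 5.17; `a` holds the coefficients a_1, a_2, ... supplied by the caller."""
    xs = sorted(x)
    n = len(xs)
    if len(a) > (n + 1) // 2:
        raise ValueError("more coefficients than symmetric pairs")
    xbar = mean(xs)
    # Numerator: weighted sum of the symmetric distances (squared below).
    # A coefficient for the zero middle distance (n odd) may be left out of `a`.
    b = sum(ai * (xs[n - i] - xs[i - 1]) for i, ai in enumerate(a, start=1))
    # Denominator: sum of squared deviations from the sample mean
    ss = sum((xi - xbar) ** 2 for xi in xs)
    return b * b / ss


# Three equally spaced values with the assumed n = 3 coefficient: W is close to 1
print(round(shapiro_wilk_w([1, 2, 3], [0.7071]), 3))
```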