Page 238 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

Exercises


           5.7  Several previous Examples and Exercises assumed a normal distribution for the
               variables being tested. Using the Lilliefors and Shapiro-Wilk tests, check this
               assumption for variables used in:
               a)  Examples 3.6, 3.7, 4.1, 4.5, 4.13, 4.14 and 4.20.
               b)  Exercises 3.2, 3.8, 4.9, 4.12 and 4.13.

           5.8  The Signal & Noise dataset contains amplitude values of a noisy signal for
               consecutive time instants, and a “detection” variable indicating when the amplitude is
               above a specified threshold, ∆. For ∆ = 1, compute the number of time instants between
               successive detections and use the chi-square test to assess the goodness of fit of the
               geometric, Poisson and Gamma distributions to the empirical inter-detection time. The
               geometric, Poisson and Gamma distributions are described in Appendix B.
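
The first steps of Exercise 5.8 can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not the book's procedure: the function names are hypothetical, the inter-detection time is taken as the difference between the time indices of successive detections, and the geometric model is assumed to be P(K = k) = p(1 - p)^(k - 1) for k = 1, 2, ..., with p estimated as the reciprocal of the mean interval (check this against the definition in Appendix B). The sketch breaks down if every interval equals 1, since the pooled tail class then has zero expected count.

```python
from collections import Counter

def inter_detection_times(amplitudes, threshold=1.0):
    """Differences between time indices of successive detections (amplitude > threshold)."""
    hits = [i for i, a in enumerate(amplitudes) if a > threshold]
    return [b - a for a, b in zip(hits, hits[1:])]

def chi_square_geometric(times, max_k):
    """Chi-square statistic of observed interval counts against a geometric model.

    Classes are k = 1, ..., max_k - 1, plus one pooled tail class k >= max_k.
    Assumes P(K = k) = p * (1 - p)**(k - 1), with p estimated as 1 / mean(times).
    Returns (statistic, degrees of freedom, estimated p).
    """
    n = len(times)
    p = n / sum(times)
    counts = Counter(min(t, max_k) for t in times)
    probs = [p * (1 - p) ** (k - 1) for k in range(1, max_k)]
    probs.append((1 - p) ** (max_k - 1))  # pooled tail: P(K >= max_k)
    stat = sum((counts.get(k, 0) - n * q) ** 2 / (n * q)
               for k, q in enumerate(probs, start=1))
    df = max_k - 2  # number of classes, minus 1, minus one estimated parameter
    return stat, df, p
```

The statistic is then compared with the chi-square critical value for the returned degrees of freedom. The choice of `max_k` is left open here; it should pool the sparse tail so that the expected counts are not too small.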

           5.9  Consider the temperature data, T, of the Weather dataset (Data 1) and assume that
               it is a valid sample of the yearly temperature at 12H00 in the respective locality.
               Determine whether one can, with 95% confidence, accept the Beta distribution model
               with p = q = 3 for the empirical distribution of T. The Beta distribution is described in
               Appendix B.
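
One possible route for Exercise 5.9 is a Kolmogorov-Smirnov comparison of the empirical distribution with the Beta model. The sketch below is illustrative only and rests on assumptions the exercise does not state: the Beta distribution lives on [0, 1], so T must first be mapped to the unit interval, and the bounds `low` and `high` passed to the hypothetical `rescale` helper have to be chosen by the reader. For p = q = 3 the density is 30x²(1 - x)², so the distribution function has the closed form 10x³ - 15x⁴ + 6x⁵.

```python
def beta33_cdf(x):
    """Distribution function of the Beta distribution with p = q = 3 on [0, 1]."""
    x = min(max(x, 0.0), 1.0)
    return 10 * x**3 - 15 * x**4 + 6 * x**5

def rescale(values, low, high):
    """Map values from [low, high] to [0, 1]; the bounds are the reader's choice."""
    return [(v - low) / (high - low) for v in values]

def ks_statistic(sample, cdf):
    """Largest gap between the empirical distribution function and a model cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical function jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d
```

For a fully specified model and a large sample, the statistic is commonly compared with the asymptotic 5% critical value of roughly 1.36/√n; for small samples an exact table should be used instead.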

           5.10 Consider the ASTV measurement data sample of the FHR-Apgar dataset. Check the
               following statements:
               a)  Variable ASTV cannot have a normal distribution.
               b)  The distribution of ASTV in hospital HUC can be well modelled by the normal
                   distribution.
               c)  The distribution of ASTV in hospital HSJ cannot be modelled by the normal
                   distribution.
               d)  If variable ASTV has a normal distribution in the three hospitals, HUC, HGSA
                   and HSJ, then ASTV has a normal distribution in the Portuguese population.
               e)  If variable ASTV has a non-normal distribution in one of the three hospitals,
                   HUC, HGSA and HSJ, then ASTV cannot be well modelled by a normal
                   distribution in the Portuguese population.

           5.11 Some authors consider Yates’ correction overly conservative. Using the Freshmen
               dataset (see Example 5.9), assess whether or not “the proportion of male students that
               are ‘initiated’ is smaller than that of female students” with and without Yates’
               correction and comment on the results.
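
The effect of Yates’ correction on a 2×2 table can be explored with a short Python sketch using only the standard library. The function name `chi2_2x2` is hypothetical, and the counts in the usage lines are made up for illustration; they are not the Freshmen data. The p-value relies on the fact that, for one degree of freedom, P(χ² > x) = erfc(√(x/2)).

```python
import math

def chi2_2x2(table, yates=False):
    """Chi-square test of a 2x2 table [[a, b], [c, d]]; returns (statistic, p-value).

    With yates=True, 0.5 is subtracted from each |observed - expected|
    (never going below zero) before squaring.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = rows[i] * cols[j] / n
            diff = abs(obs - exp)
            if yates:
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / exp
    # One degree of freedom: P(chi2 > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

# Made-up counts, for illustration only.
plain = chi2_2x2([[10, 20], [20, 10]])
corrected = chi2_2x2([[10, 20], [20, 10]], yates=True)
```

The corrected statistic is never larger than the uncorrected one, which is the sense in which the correction is called conservative. The p-value returned here is two-sided, whereas the statement in the exercise is directional and needs a one-sided reading.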

           5.12 Consider the “Commitment to quality improvement” and “Time dedicated to
               improvement” variables of the Metal Firms’ dataset. Assume that they have binary
               ranks: 1 if the score is below 3, and 0 otherwise. Can one accept the association of
               these two variables with 95% confidence?
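
The recoding step of Exercise 5.12 can be sketched as follows. The helper names and the sample scores are hypothetical and are not taken from the Metal Firms’ dataset; only the rule "1 if the score is below 3, and 0 otherwise" comes from the exercise.

```python
def binary_rank(score, cutoff=3):
    """Return 1 if the score is below the cutoff, and 0 otherwise."""
    return 1 if score < cutoff else 0

def cross_tabulate(x_scores, y_scores):
    """2x2 table of counts; rows index the rank of x, columns the rank of y."""
    table = [[0, 0], [0, 0]]
    for xs, ys in zip(x_scores, y_scores):
        table[binary_rank(xs)][binary_rank(ys)] += 1
    return table

# Made-up scores, for illustration only.
table = cross_tabulate([1, 2, 3, 4, 5], [2, 4, 3, 1, 5])
```

The resulting 2×2 table is the input for whichever association test is chosen.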

           5.13 Redo the previous exercise using the original scores. Can one use the chi-square
               statistic in this case?

           5.14 Consider the data describing the number of students passing (SCORE ≥ 10) or flunking
               (SCORE < 10) the Programming examination in the Programming dataset. Assess
               whether or not one can be 95% confident that the pass/flunk variable is independent of
               previous knowledge in Programming (variable PROG). Also assess whether or not the