Page 238 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R
Exercises
5.7 Several previous Examples and Exercises assumed a normal distribution for the
variables being tested. Using the Lilliefors and Shapiro-Wilk tests, check this
assumption for variables used in:
a) Examples 3.6, 3.7, 4.1, 4.5, 4.13, 4.14 and 4.20.
b) Exercises 3.2, 3.8, 4.9, 4.12 and 4.13.
5.8 The Signal & Noise dataset contains amplitude values of a noisy signal for
consecutive time instants, and a “detection” variable indicating when the amplitude is
above a specified threshold, ∆. For ∆ = 1, compute the number of time instants between
successive detections and use the chi-square test to assess the goodness of fit of the
geometric, Poisson and Gamma distributions to the empirical inter-detection time. The
geometric, Poisson and Gamma distributions are described in Appendix B.
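
The first two steps of Exercise 5.8 can be sketched as follows, using only the Python standard library. This is a minimal illustration under stated assumptions, not the book's procedure: the helper names are hypothetical, the inter-detection time is taken as the difference between the indices of successive detections (so consecutive detections give 1), and the geometric distribution is assumed to have support 1, 2, ... with p estimated by maximum likelihood. The detection sequence is toy data, not the Signal & Noise dataset.

```python
from collections import Counter


def inter_detection_times(detections):
    """Index differences between successive detections (assumed definition)."""
    idx = [i for i, d in enumerate(detections) if d]
    return [b - a for a, b in zip(idx, idx[1:])]


def geometric_chi_square(times, max_class):
    """Chi-square statistic for a geometric fit.

    Classes are 1, 2, ..., max_class - 1 and a pooled tail class '>= max_class'.
    Returns (statistic, degrees of freedom, estimated p).
    """
    n = len(times)
    p = n / sum(times)  # maximum-likelihood estimate, support 1, 2, ...
    counts = Counter(min(t, max_class) for t in times)
    chi2 = 0.0
    for k in range(1, max_class + 1):
        if k < max_class:
            prob = p * (1 - p) ** (k - 1)
        else:
            prob = (1 - p) ** (max_class - 1)  # tail probability P(T >= max_class)
        expected = n * prob
        chi2 += (counts.get(k, 0) - expected) ** 2 / expected
    df = max_class - 1 - 1  # classes - 1, minus one estimated parameter
    return chi2, df, p


# Toy detection sequence (hypothetical values, for illustration only).
detections = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1]
times = inter_detection_times(detections)
chi2, df, p_hat = geometric_chi_square(times, max_class=3)
print(times, round(chi2, 4), df, p_hat)
```

With real data the classes should be pooled so that the expected counts are not too small, and the statistic is then compared with the chi-square critical value for the returned degrees of freedom. The Poisson and Gamma fits follow the same pattern with their own class probabilities.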
5.9 Consider the temperature data, T, of the Weather dataset (Data 1) and assume that
it is a valid sample of the yearly temperature at 12H00 in the respective locality.
Determine whether one can, with 95% confidence, accept the Beta distribution model
with p = q = 3 for the empirical distribution of T. The Beta distribution is described in
Appendix B.
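
One possible way to approach Exercise 5.9 is a Kolmogorov-Smirnov comparison against the Beta(3, 3) distribution, whose distribution function has the closed form F(x) = 10x^3 - 15x^4 + 6x^5 on [0, 1]. The sketch below makes several assumptions that are not in the exercise: the choice of test, the rescaling of T to [0, 1] by its sample minimum and maximum, and the use of the large-sample approximation 1.36/sqrt(n) for the 5% critical value. The function names and the temperature values are hypothetical.

```python
import math


def beta33_cdf(x):
    """Distribution function of Beta(p=3, q=3), clipped to [0, 1]."""
    x = min(1.0, max(0.0, x))
    return 10 * x**3 - 15 * x**4 + 6 * x**5


def ks_statistic(sample, cdf):
    """Largest gap between the empirical and the reference distribution function."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # Check the gap just after and just before each jump of the empirical CDF.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def rescale(values):
    """Map values linearly to [0, 1] (an assumed preprocessing step)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical temperatures, not taken from the Weather dataset.
temps = [8.0, 11.5, 13.0, 14.2, 15.0, 16.1, 17.3, 18.0, 20.5, 24.0]
d = ks_statistic(rescale(temps), beta33_cdf)
critical = 1.36 / math.sqrt(len(temps))  # approximate 5% critical value, large n
print(d, critical, d > critical)
```

For small samples the exact Kolmogorov-Smirnov tables should be preferred over the asymptotic approximation.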
5.10 Consider the ASTV measurement data sample of the FHR-Apgar dataset. Check the
following statements:
a) Variable ASTV cannot have a normal distribution.
b) The distribution of ASTV in hospital HUC can be well modelled by the normal
distribution.
c) The distribution of ASTV in hospital HSJ cannot be modelled by the normal
distribution.
d) If variable ASTV has a normal distribution in the three hospitals, HUC, HGSA
and HSJ, then ASTV has a normal distribution in the Portuguese population.
e) If variable ASTV has a non-normal distribution in one of the three hospitals,
HUC, HGSA and HSJ, then ASTV cannot be well modelled by a normal
distribution in the Portuguese population.
5.11 Some authors consider Yates’ correction overly conservative. Using the Freshmen
dataset (see Example 5.9), assess whether or not “the proportion of male students that
are ‘initiated’ is smaller than that of female students” with and without Yates’
correction and comment on the results.
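
The effect of Yates' correction on a 2×2 table can be illustrated with a short standard-library sketch. The function name and the cell counts are hypothetical (they are not the Freshmen data). The p-value uses the fact that, for one degree of freedom, the chi-square tail probability equals erfc(sqrt(x/2)).

```python
import math


def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, two-sided p-value) for 1 degree of freedom.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink |ad - bc| by n/2, never below zero.
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff**2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square tail, 1 df
    return chi2, p_value


# Hypothetical counts: rows = male / female, columns = initiated / not initiated.
plain = chi_square_2x2(10, 20, 20, 10)
corrected = chi_square_2x2(10, 20, 20, 10, yates=True)
print(plain, corrected)
```

The corrected statistic is never larger than the uncorrected one, so its p-value is never smaller, which is the sense in which the correction is called conservative. Since the exercise states a one-sided hypothesis, the two-sided p-value would be halved when the observed difference lies in the hypothesised direction.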
5.12 Consider the “Commitment to quality improvement” and “Time dedicated to
improvement” variables of the Metal Firms’ dataset. Assume that they have binary
ranks: 1 if the score is below 3, and 0 otherwise. Can one accept the association of
these two variables with 95% confidence?
5.13 Redo the previous exercise using the original scores. Can one use the chi-square
statistic in this case?
5.14 Consider the data describing the number of students passing (SCORE ≥ 10) or flunking
(SCORE < 10) the Programming examination in the Programming dataset. Assess
whether or not one can be 95% confident that the pass/flunk variable is independent of
previous knowledge in Programming (variable PROG). Also assess whether or not the