Page 240 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

Exercises


           5.23 Run the non-parametric counterparts of the tests used in Exercises 4.9, 4.10 and 4.20.
               Compare the results and the power of the tests with those obtained using parametric
               tests.

           5.24 Using appropriate non-parametric tests, determine which variables of the Wines’
               dataset are most discriminative of the white from the red wines.

           5.25 The Neonatal dataset contains mortality data for deliveries taking place at home (MH)
               and at a Health Centre (MI). Assess whether there are significant differences at the 5%
               level between delivery conditions, using the sign and the Wilcoxon tests.
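
As a rough illustration of the sign test mentioned in Exercise 5.25, the sketch below computes an exact two-sided p-value from the binomial distribution with p = 0.5. The function name `sign_test` and the paired values are hypothetical; the numbers are made up for illustration and are not taken from the Neonatal dataset.

```python
from math import comb

def sign_test(x, y):
    """Exact two-sided sign test for paired samples (tied pairs are dropped)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)                       # number of non-tied pairs
    n_pos = sum(d > 0 for d in diffs)    # number of positive differences
    k = min(n_pos, n - n_pos)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_pos, n, min(1.0, 2 * tail)

# Made-up paired values (hypothetical, NOT the Neonatal data)
mh = [12, 9, 15, 11, 8, 14, 10, 13]
mi = [10, 7, 16, 9, 6, 11, 10, 9]

n_pos, n, p = sign_test(mh, mi)
print(f"positive differences: {n_pos} of {n}, two-sided p = {p:.4f}")
```

With these made-up values the null hypothesis of no difference would not be rejected at the 5% level. The Wilcoxon test also uses the magnitudes of the differences and may therefore reach a different conclusion on the same data.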

           5.26 Consider the Firms’ dataset containing productivity figures (P) for a sample of
               Portuguese firms in four branches of activity (BRANCH). Study the dataset in order to:
               a)  Assess at a 5% level of significance whether there are significant differences
                   among the productivity medians of the four branches.
               b)  Assess at a 1% level of significance whether Commerce and Industry have
                   significantly different medians.
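
For part a) of Exercise 5.26, one common choice for comparing several independent groups is the Kruskal-Wallis test. The sketch below computes the H statistic with midranks and the usual tie correction. The helper names `midranks` and `kruskal_wallis_h` and the four small groups are hypothetical; the values are made up and are not taken from the Firms’ dataset.

```python
def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    return [
        sum(u < v for u in values) + (sum(u == v for u in values) + 1) / 2
        for v in values
    ]

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction for independent groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        r_sum = sum(ranks[start:start + len(g)])  # rank sum of this group
        total += r_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over groups of tied values
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / correction

# Made-up productivity values for four hypothetical branches
branches = [[1, 2], [3, 4], [5, 6], [7, 8]]
h = kruskal_wallis_h(branches)
print(f"H = {h:.4f} (compare with chi-square, df = {len(branches) - 1})")
```

For larger samples H is compared with a chi-square distribution with (number of groups - 1) degrees of freedom; for three degrees of freedom the 5% critical value is approximately 7.815. Part b) compares only two groups, for which a two-sample rank test such as Mann-Whitney is the usual choice.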

           5.27 Apply the appropriate non-parametric test in order to rank the discriminative capability
               of the features used to characterise the tissue types in the Breast Tissue dataset.

           5.28 Redo Exercise 5.27 for the CTG dataset and the three-class discrimination
               expressed by the grouping variable NSP.

           5.29 Consider the discrimination of the three clay types based on the sample data of the
               Clays’ dataset. Show that the null hypothesis of equal medians for the three clay
               types is:
               a)  Rejected with more than 95% confidence for all grading variables (LG, MG, HG).
               b)  Not rejected for the iron oxide features.
               c)  Rejected with higher confidence for the lime (CaO) than for the silica (SiO2).

           5.30 The FHR dataset contains measurements of basal heart rate performed by three human
               experts and an automatic diagnostic system. Assess whether the null hypothesis of
               equal median measurements can be accepted at a 5% significance level for the three
               human experts and the automatic diagnostic system.
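
Exercise 5.30 involves several related measurements on the same cases, a setting in which the Friedman test is often used. The sketch below computes the Friedman statistic without a tie correction, which is a simplifying assumption. The function name `friedman_statistic` and the measurements are hypothetical; the values are made up and are not taken from the FHR dataset.

```python
def friedman_statistic(rows):
    """Friedman chi-square statistic for n cases (rows) by k treatments (columns).

    Values are ranked within each row using midranks; no tie correction
    is applied to the statistic (a simplifying assumption of this sketch).
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, v in enumerate(row):
            # midrank of v within its own row
            rank = sum(u < v for u in row) + (sum(u == v for u in row) + 1) / 2
            rank_sums[j] += rank
    s = sum(r ** 2 for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * s - 3 * n * (k + 1)

# Made-up heart-rate readings: each row is one case, each column one rater
readings = [
    [120, 122, 125, 128],
    [130, 131, 135, 133],
    [140, 138, 142, 145],
    [110, 112, 111, 115],
    [125, 127, 129, 130],
]
stat = friedman_statistic(readings)
print(f"Friedman statistic = {stat:.2f} (df = {len(readings[0]) - 1})")
```

The statistic is approximately chi-square distributed with k - 1 degrees of freedom, although the approximation is rough for as few cases as in this toy example. For three degrees of freedom the 5% critical value is approximately 7.815.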

           5.31 When analysing the contents of questions Q4, Q5, Q6 and Q7, someone said that “these
               questions are essentially evaluating the same thing”. Assess whether this statement can
               be accepted at a 5% significance level. Compute the coefficient of agreement κ and
               discuss its significance.

           5.32 The Programming dataset contains results of an enquiry regarding freshmen's
               previous knowledge of programming (PROG), Boole’s Algebra (AB), binary
               arithmetic (BA) and computer hardware (H). Consider the variables PROG, AB, BA
               and H dichotomised in a “yes/no” fashion. Can one reject with 99% confidence the
               hypothesis that the four dichotomised variables essentially evaluate the same thing?
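
For dichotomised related variables such as those in Exercise 5.32, the Cochran Q test is a common choice. The sketch below computes Q from a table of 0/1 answers. The function name `cochran_q` and the answers are hypothetical; the values are made up and are not taken from the Programming dataset.

```python
def cochran_q(table):
    """Cochran's Q statistic for a table of 0/1 values (rows = subjects)."""
    k = len(table[0])                                   # number of variables
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    n_total = sum(row_totals)                           # grand total of ones
    denom = k * n_total - sum(r ** 2 for r in row_totals)
    if denom == 0:
        raise ValueError("Q is undefined: every row is all zeros or all ones")
    return (k - 1) * (k * sum(c ** 2 for c in col_totals) - n_total ** 2) / denom

# Made-up yes/no answers (1 = yes); columns stand for PROG, AB, BA, H
answers = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]
q = cochran_q(answers)
print(f"Q = {q:.2f} (df = {len(answers[0]) - 1})")
```

Q is approximately chi-square distributed with k - 1 degrees of freedom. For three degrees of freedom the 1% critical value is approximately 11.345, so these made-up answers would lead to rejection with 99% confidence.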