Page 129 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R
3 Estimating Data Parameters
3.3 Compute the 95% confidence interval of the mean and of the standard deviation of the
RC variable of the previous exercise, for the samples constituted by the first 50 cases
and by the last 50 cases. Comment on the results.
3.4 Consider the ASTV and ALTV variables of the CTG dataset. Assume that only a
15-case random sample is available for these variables. Can one expect to obtain
reliable estimates of the 95% confidence interval of the mean of these variables using
the Student’s t distribution applied to those samples? Why? (Inspect the variable
histograms.)
3.5 Obtain a 15-case random sample of the ALTV variable of the previous exercise (see
Commands 3.2). Compute the respective 95% confidence interval assuming a normal
and an exponential fit to the data and compare the results. The exponential fit can be
performed in MATLAB with the function expfit.
3.6 Compute the 90% confidence interval of the ASTV and ALTV variables of the
previous Exercise 3.4 for 10 random samples of 20 cases and determine how many
times the confidence interval contains the mean value determined for the whole
2126-case set. In a long run of these 20-case experiments, which variable is expected to yield
a higher percentage of intervals containing the whole-set mean?
3.7 Compute the mean with the 95% confidence interval of variable ART of the Cork
Stoppers dataset. Perform the same calculations on variable LOGART = ln(ART).
Apply Gauss' approximation formula of A.6.1 in order to compare the results.
Which point estimates and confidence intervals are more reliable? Why?
3.8 Consider the PERIM variable of the Breast Tissue dataset. What is the tolerance
of the PERIM mean with 95% confidence for the carcinoma class? How many cases of
the carcinoma class should one have available in order to reduce that tolerance to 2%?
3.9 Imagine that when analysing the TW=“Team Work” variable of the Metal Firms
dataset, someone stated that the team-work is at least good (score 4) for 3/8 = 37.5% of
the metallurgic firms. Does this statement deserve any credit? (Compute the 95%
confidence interval of this estimate.)
3.10 Consider the Culture dataset. Determine the 95% confidence interval of the
proportion of boroughs spending more than 20% of the budget for musical activities.
3.11 Using the CTG dataset, determine the percentage of foetal heart rate cases that have
abnormal short term variability of the heart rate more than 50% of the time, during
calm sleep (CLASS A). Also, determine the 95% confidence interval of that percentage
and how many cases should be available in order to obtain an interval estimate with 1%
tolerance.
3.12 A proportion p̂ was estimated in 225 cases. What are the approximate worst-case 95%
confidence interval limits of the proportion?
3.13 Redo Exercises 3.2 and 3.3 for the 99% confidence interval of the standard deviation.
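
As a starting point for the proportion exercises (3.9 to 3.12), the following is a minimal Python sketch of the normal-approximation (Wald) interval for a proportion. It rests on assumptions not stated in the exercises: that "tolerance" means the half-width of the interval, and that the normal approximation is acceptable, which is doubtful for very small samples such as the 8 firms of Exercise 3.9. The function names are illustrative only and do not come from the text or from any of the four packages.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_value(confidence=0.95):
    """Two-sided standard normal quantile for the given confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def wald_interval(p_hat, n, confidence=0.95):
    """Normal-approximation (Wald) interval for a proportion, clipped to [0, 1]."""
    half_width = z_value(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


def worst_case_half_width(n, confidence=0.95):
    """Largest half-width for n cases; p_hat * (1 - p_hat) peaks at p_hat = 0.5."""
    return z_value(confidence) * 0.5 / sqrt(n)


def cases_for_tolerance(tolerance, p_hat=0.5, confidence=0.95):
    """Smallest n whose Wald half-width does not exceed `tolerance` (assumed absolute)."""
    return ceil(z_value(confidence) ** 2 * p_hat * (1 - p_hat) / tolerance ** 2)


if __name__ == "__main__":
    # Exercise 3.9 setting: 3 firms out of 8 (approximation is rough for n = 8).
    print(wald_interval(3 / 8, 8))
    # Exercise 3.12 setting: worst-case half-width with 225 cases.
    print(worst_case_half_width(225))
```

When the estimated proportion is not yet known, `cases_for_tolerance` defaults to `p_hat=0.5`, which yields the most conservative (largest) sample size; passing the observed proportion instead gives a smaller requirement.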