Page 129 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

108      3 Estimating Data Parameters


           3.3  Compute the 95% confidence interval of the mean and of the standard deviation of the
               RC variable of the previous exercise, for the samples constituted by the first 50 cases
               and by the last 50 cases. Comment on the results.

           3.4  Consider the ASTV and ALTV variables of the CTG dataset. Assume that only a
               15-case random sample is available for these variables. Can one expect to obtain
               reliable estimates of the 95% confidence interval of the mean of these variables using
               the Student’s t distribution applied to those samples? Why? (Inspect the variable
               histograms.)

           3.5  Obtain a 15-case random sample of the ALTV variable of the previous exercise (see
               Commands 3.2). Compute the respective 95% confidence interval assuming a normal
               and an exponential fit to the data and compare the results. The exponential fit can be
               performed in MATLAB with the function expfit.

           3.6  Compute the 90% confidence interval of the mean of the ASTV and ALTV variables of
               Exercise 3.4 for 10 random samples of 20 cases and determine how many
               times the confidence interval contains the mean value determined for the whole
               2126-case set. In a long run of these 20-case experiments, which variable is expected
               to yield a higher percentage of intervals containing the whole-set mean?
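
               The repeated-sampling experiment described above can be sketched as follows. This is
               only an illustration under stated assumptions: the two populations are synthetic
               stand-ins (one roughly symmetric, one strongly skewed), not the ASTV and ALTV data,
               the function names are hypothetical, and the Student's t quantile for 19 degrees of
               freedom is entered as a tabulated constant.

```python
import random
import statistics

# Tabulated Student's t quantile t(0.95, df=19): 90% two-sided interval, n = 20.
T_095_DF19 = 1.729


def mean_ci(sample, t_quantile):
    """Return (low, high) of the t-based confidence interval of the mean."""
    n = len(sample)
    centre = statistics.mean(sample)
    half_width = t_quantile * statistics.stdev(sample) / n ** 0.5
    return centre - half_width, centre + half_width


def count_covering(population, n_samples=10, n_cases=20, seed=0):
    """Count how many sample intervals contain the whole-set mean."""
    rng = random.Random(seed)
    whole_mean = statistics.mean(population)
    hits = 0
    for _ in range(n_samples):
        low, high = mean_ci(rng.sample(population, n_cases), T_095_DF19)
        hits += low <= whole_mean <= high
    return hits


# Synthetic stand-ins (assumption, NOT the CTG data).
_rng = random.Random(1)
symmetric = [_rng.gauss(50, 15) for _ in range(2126)]
skewed = [_rng.expovariate(1 / 10) for _ in range(2126)]

print("symmetric:", count_covering(symmetric), "of 10")
print("skewed:   ", count_covering(skewed), "of 10")
```

               Raising n_samples well above 10 gives a rough picture of the long-run coverage for
               each kind of population.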

           3.7  Compute the mean with the 95% confidence interval of variable ART of the Cork
               Stoppers dataset. Perform the same calculations on variable LOGART = ln(ART).
               Apply Gauss’ approximation formula of A.6.1 in order to compare the results.
               Which point estimates and confidence intervals are more reliable? Why?

           3.8  Consider the PERIM variable of the Breast Tissue dataset. What is the tolerance
               of the PERIM mean with 95% confidence for the carcinoma class? How many cases of
               the carcinoma class should one have available in order to reduce that tolerance to 2%?
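
               A minimal sketch of the sample-size calculation, assuming that "tolerance" means the
               half-width of the confidence interval relative to the mean, and that the normal
               quantile is an acceptable stand-in for the Student's t quantile when the resulting
               number of cases is large. The function name and the numeric inputs are hypothetical
               and are not taken from the Breast Tissue dataset.

```python
import math
from statistics import NormalDist


def cases_for_relative_tolerance(mean, std_dev, tolerance, confidence=0.95):
    """Approximate number of cases so that (interval half-width / mean) <= tolerance.

    Uses the normal quantile z instead of Student's t (large-sample assumption):
    z * s / sqrt(n) <= tolerance * mean  =>  n >= (z * s / (tolerance * mean))**2
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * std_dev / (tolerance * mean)) ** 2)


# Hypothetical sample mean and standard deviation, for illustration only.
print(cases_for_relative_tolerance(mean=100.0, std_dev=20.0, tolerance=0.02))
```

               In practice the sample mean and standard deviation of the carcinoma class would
               replace the hypothetical values.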

           3.9  Imagine that when analysing the TW = “Team Work” variable of the Metal Firms
               dataset, someone stated that team work is at least good (score 4) for 3/8 = 37.5% of
               the metallurgic firms. Does this statement deserve any credit? (Compute the 95%
               confidence interval of this estimate.)
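
               With only 8 cases the normal approximation of the binomial distribution is doubtful,
               so one option is an exact (Clopper-Pearson) interval. The sketch below is one possible
               way to obtain it with a plain bisection on the binomial distribution function; the
               function names are hypothetical and the text does not prescribe this method.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, lo=0.0, hi=1.0, iterations=60):
    """Bisection for f(p) = target, where f is decreasing in p on [lo, hi]."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f still too large: the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2


def exact_proportion_ci(k, n, confidence=0.95):
    """Clopper-Pearson interval for k successes observed in n cases."""
    alpha = 1 - confidence
    # Lower limit: P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper limit: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


low, high = exact_proportion_ci(3, 8)
print(f"3/8 = 0.375, 95% interval: [{low:.3f}, {high:.3f}]")
```

               The width of the printed interval is what the exercise asks one to judge.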

           3.10 Consider the Culture dataset. Determine the 95% confidence interval of the
               proportion of boroughs spending more than 20% of the budget for musical activities.

           3.11 Using the CTG dataset, determine the percentage of foetal heart rate cases that have
               abnormal short term variability of the heart rate more than 50% of the time, during
               calm sleep (CLASS A). Also, determine the 95% confidence interval of that percentage
               and how many cases should be available in order to obtain an interval estimate with 1%
               tolerance.
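
               The number of cases needed for a given tolerance of a proportion can be sketched as
               below, assuming the normal approximation of the binomial distribution and reading
               "tolerance" as the absolute half-width of the interval. Both are assumptions, the
               function name is hypothetical, and the proportion used in the call is a placeholder
               rather than the value estimated from the CTG data.

```python
import math
from statistics import NormalDist


def cases_for_proportion_tolerance(p, tolerance, confidence=0.95):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= tolerance (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * p * (1 - p) / tolerance ** 2)


# Placeholder proportion p = 0.5 (the most demanding case), tolerance of 1%.
print(cases_for_proportion_tolerance(0.5, 0.01))
```

               Substituting the estimated proportion for the placeholder can only lower the required
               number of cases, since p(1 - p) is largest at p = 0.5.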

           3.12 A proportion p̂ was estimated in 225 cases. What are the approximate worst-case 95%
               confidence interval limits of the proportion?
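
               A short sketch of the worst-case calculation, assuming the normal approximation: the
               standard error sqrt(p(1 - p)/n) is largest at p = 0.5, so the widest possible interval
               is p̂ ± z·sqrt(0.25/n). The function name is hypothetical.

```python
from statistics import NormalDist


def worst_case_half_width(n, confidence=0.95):
    """Largest possible half-width of a proportion's interval (attained at p = 0.5)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * (0.25 / n) ** 0.5


print(round(worst_case_half_width(225), 4))
```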

           3.13 Redo Exercises 3.2 and 3.3 for the 99% confidence interval of the standard deviation.