Page 200 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

5.1 Inference on One Population   181


           Does the Freshmen dataset confirm that belief for the Porto Engineering
           College?
           A: We use the categories of answers obtained for Question 6, “I felt compelled to
           participate in the Initiation”, of the freshmen dataset (see Appendix E). The
           respective EXCEL file contains the computations of the frequencies of occurrence
           of each category for each question, assuming a specified threshold for the
           average results in the examinations. Using, for instance, a threshold of 10, we see
           that there are 102 “best” students, i.e., students with an average examination score
           not less than the threshold. Among these 102, the counts for the five categories of
           Question 6 vary, from 16 students who “fully disagree” to 5 students who “fully
           agree”.
              Under the null hypothesis, the answers to Question 6 have no relation with the
           freshmen performance and we would expect equal frequencies for all categories,
           i.e., 102/5 = 20.4 students per category.
              The chi-square test results obtained with SPSS are shown in Table 5.6. Based on
           these results, we reject the null hypothesis: there is evidence that the answer to
           Question 6 of the freshmen enquiry bears some relation with the student
           performance.

           Table 5.6. Dataset (a) and results (b), obtained with SPSS, for Question 6 of the
           freshmen enquiry and 102 students with average score ≥10.

           (a) Dataset
               CAT   Observed N   Expected N   Residual
                1        16          20.4        −4.4
                2        26          20.4         5.6
                3        39          20.4        18.6
                4        16          20.4        −4.4
                5         5          20.4       −15.4

           (b) Test statistics
                               CAT
               Chi-Square    32.020
               df                 4
               Asymp. Sig.    0.000
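
           The figures in Table 5.6 can be checked by hand. The following is a minimal
           sketch in Python (not the SPSS procedure itself), using only the standard library.
           The observed counts are taken from Table 5.6a; the helper chi2_sf_even_df is a
           hypothetical name introduced here, and it relies on the closed-form upper-tail
           probability of the chi-square distribution, which holds only for an even number
           of degrees of freedom.

```python
import math

# Observed counts for the five answer categories (Table 5.6a).
observed = [16, 26, 39, 16, 5]

n = sum(observed)            # 102 students
k = len(observed)            # 5 categories
expected = [n / k] * k       # equal frequencies under the null hypothesis

# Residuals (observed minus expected) and the chi-square statistic.
residuals = [o - e for o, e in zip(observed, expected)]
chi_square = sum(r * r / e for r, e in zip(residuals, expected))
df = k - 1


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{j<df/2} (x/2)^j / j!,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


p_value = chi2_sf_even_df(chi_square, df)

print(f"Chi-Square = {chi_square:.3f}, df = {df}")
print(f"p-value    = {p_value:.2e}")
```

           The computed statistic agrees with the value 32.020 in Table 5.6b, and the
           p-value is far below 0.0005, which is why SPSS displays it as 0.000.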


           Example 5.7

           Q: Consider the variable ART, representing the total area of defects in the Cork
           Stoppers’ dataset, for class 1 (Super) corks. Does the sample data
           provide evidence that this variable can be accepted as being normally distributed in
           that class?
           A: This example illustrates the application of the chi-square test for assessing the
           goodness of fit to a known distribution. In this case, the chi-square test uses the
           deviations of the observed absolute frequencies from the expected absolute
           frequencies under the condition of the stated null hypothesis, i.e., that the variable
           ART is normally distributed.
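
           To make the expected frequencies concrete: under a normal null hypothesis,
           intervals bounded by percentiles of the normal distribution each receive a known
           share of the cases. The following is a minimal sketch in Python, assuming
           equal-probability intervals. The function name equal_probability_bins is
           hypothetical, and the default mean and standard deviation (0 and 1) are
           placeholders for the standard normal, not estimates for ART; the sample values
           would have to be supplied.

```python
from statistics import NormalDist


def equal_probability_bins(n_cases, n_bins, mean=0.0, sd=1.0):
    """Return interval cut points and the expected count per interval.

    The cut points are the percentiles of a normal distribution that
    split it into n_bins intervals of equal probability. The defaults
    mean=0, sd=1 are placeholders (standard normal), not values for ART.
    """
    dist = NormalDist(mu=mean, sigma=sd)
    cuts = [dist.inv_cdf(i / n_bins) for i in range(1, n_bins)]
    expected_per_bin = n_cases / n_bins
    return cuts, expected_per_bin


# Five intervals of 20% each for n = 50 cases, in standardized units.
cuts, expected_per_bin = equal_probability_bins(n_cases=50, n_bins=5)

print("cut points (z):", [round(c, 4) for c in cuts])
print("expected cases per interval:", expected_per_bin)
```

           With five intervals and 50 cases, every interval has an expected frequency of
           10. Passing the sample mean and standard deviation of the variable converts the
           standardized cut points into cut points on the original scale.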
              In order to compute the absolute frequencies, we have to establish a set of
           intervals based on the percentiles of the normal distribution. Since the number of
           cases is n = 50, and we want the conditions for using the chi-square distribution to
           be fulfilled, we use intervals corresponding to 20% of the cases, i.e., an expected
           frequency of 10 cases per interval. Table 5.7 shows