Page 286 - Statistics II for Dummies

270        Part IV: Building Strong Connections with Chi-Square Tests



                                Checking the conditions before you start


                                Every statistical technique seems to have a catch, and this case is no exception.
                                In order to use the Chi-square distribution to interpret your goodness-of-fit
                                statistic, you have to be sure you have enough information to work with in
                                each cell. The stats gurus usually recommend that the expected count for
                                each cell turns out to be greater than or equal to five. If it doesn’t, one option
                                is to combine categories to increase the numbers.
                                In the M&M’S example, the expected cell counts are all above seven (see
                                Table 15-3), so the conditions are met. If this weren’t the case, you should
                                have taken a larger sample size, because you calculate the expected cell
                                counts by taking the expected percentage in that cell times the sample size. If
                                you increase the sample size, you increase the expected cell count. A larger
                                sample size also increases your chances of detecting a real deviation from
                                the model. This idea is related to the power of the test (see Chapter 3 for
                                information on power).
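The condition check described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's own procedure: the helper names `expected_counts` and `condition_met` are made up for this example, and the four-category model is hypothetical (it is not the M&M'S percentages from Table 15-3). Exact fractions are used so that a count of exactly five isn't thrown off by rounding.

```python
from fractions import Fraction


def expected_counts(proportions, n):
    """Expected count per cell: the model proportion times the sample size."""
    return [p * n for p in proportions]


def condition_met(proportions, n, minimum=5):
    """True when every expected cell count is at least `minimum`."""
    return all(count >= minimum for count in expected_counts(proportions, n))


# Hypothetical four-category model (assumed for illustration only).
model = [Fraction(1, 2), Fraction(3, 10), Fraction(3, 20), Fraction(1, 20)]

for n in (60, 100):
    counts = [int(c) if c.denominator == 1 else float(c)
              for c in expected_counts(model, n)]
    print(n, counts, condition_met(model, n))
```

With this made-up model, a sample of 60 leaves the smallest cell with an expected count of 3, so the condition fails; a sample of 100 brings that cell up to 5, which is just enough.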

                                After you collect your data, it’s not right to go back and take a new and larger
                                sample. It’s best to set up the appropriate sample size ahead of time, and you
                                can do this by determining what sample size you need to get the expected cell
                                counts to be at least five. For example, if you roll a fair die, you expect 1/6 of the
                                outcomes to be ones. If you only take a sample of six rolls, you have an
                                expected cell count of 6 × 1/6 = 1, which isn’t enough. However, if you roll the die
                                30 times, your expected cell count is 30 × 1/6 = 5, which is just enough to meet
                                the condition.
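To plan the sample size ahead of time, you can work backward from the smallest proportion in the model: the sample has to be large enough that even the rarest category reaches an expected count of five. The sketch below assumes that reasoning; the function name `minimum_sample_size` is hypothetical, chosen just for this example.

```python
import math
from fractions import Fraction


def minimum_sample_size(proportions, minimum=5):
    """Smallest n for which every expected cell count is at least `minimum`.

    The rarest category is the bottleneck, so solve
    n * min(proportions) >= minimum for n and round up.
    """
    smallest = min(proportions)
    return math.ceil(minimum / smallest)


# Fair die: each of the six faces has proportion 1/6.
fair_die = [Fraction(1, 6)] * 6
print(minimum_sample_size(fair_die))
```

For the fair die this works out to 30 rolls, matching the calculation in the text.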


                                The steps of the Chi-square goodness-of-fit test


                                Assuming the necessary condition is met (see the previous section), you can
                                get down to actually conducting a formal goodness-of-fit test.

                                The general version of the null hypothesis for the goodness-of-fit test is Ho:
                                The model holds for all categories; versus the alternative hypothesis Ha: The
                                model doesn’t hold for at least one category. Each situation will dictate what
                                proportions should be listed in Ho for each category. For example, if you’re
                                rolling a fair die, you have Ho: Proportion of ones = 1/6; proportion of
                                twos = 1/6; . . . ; proportion of sixes = 1/6.
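One way to picture the fair-die null hypothesis is as a list of six proportions that add up to one. The sketch below writes Ho that way and then computes a goodness-of-fit statistic for a set of rolls. Two things here are assumptions, not taken from this section: the observed counts are invented for illustration, and the statistic is assumed to take its usual form, the sum over cells of (observed − expected)² / expected. The function name `chi_square_statistic` is likewise hypothetical.

```python
from fractions import Fraction

# Ho for a fair die: every face has proportion 1/6.
h0 = [Fraction(1, 6)] * 6


def chi_square_statistic(observed, proportions):
    """Sum over cells of (observed - expected)^2 / expected.

    Expected counts come from the Ho proportions times the total
    number of observations.
    """
    n = sum(observed)
    expected = [p * n for p in proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Invented results of 60 rolls (counts of ones through sixes).
observed = [8, 12, 9, 11, 10, 10]

print(sum(h0) == 1)                            # the model proportions cover all outcomes
print(float(chi_square_statistic(observed, h0)))
```

With 60 rolls the expected count is 10 per face, so the condition from the previous section is met. A statistic near zero means the rolls sit close to what Ho predicts; larger values point to at least one category straying from the model.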












