Page 327 - Statistics for Dummies

Chapter 19: Two-Way Tables and Independence
                                                    Comparing marginal and conditional to check for independence
                                                    Another way to check for independence is to see whether the marginal distri-
                                                    bution of voting pattern (overall) equals the conditional distribution of voting
                                                    pattern for each of the gender groups (males and females). If these distribu-
                                                    tions are equal, then gender doesn’t matter. Again, gender and voting pattern
                                                    are independent.
                                                    Looking at the voting pattern example, you find the conditional distribution of
                                                    voting pattern for the males (first bar in Figure 19-4) is 40% yes and 60% no. To
                                                    find the marginal (overall) distribution of voting pattern (males and females
                                                    together), take the marginal column totals in the last row of Table 19-8 (80 yes
                                                    and 120 no) and divide through by 200 (the grand total). You get 80 ÷ 200 =
                                                    0.40 or 40% yes, and 120 ÷ 200 = 0.60 or 60% no. (See the section “Calculating
                                                    marginal distributions” earlier in this chapter for more explanation.) The mar-
                                                    ginal distribution of overall voting pattern matches the conditional distribu-
                                                    tion of voting pattern for males, so voting pattern is independent of gender.
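The comparison above can be sketched in a few lines of Python. The individual cell counts below are hypothetical: they were chosen only to agree with the totals quoted in the text (80 yes, 120 no, 200 overall) and with the 40%/60% split for males. Table 19-8 holds the actual counts. The function names are also just illustrative.

```python
from fractions import Fraction

# Hypothetical cell counts (an assumption): picked to match the quoted
# column totals (80 yes, 120 no) and the 40%/60% split for males.
table = {
    "males":   {"yes": 40, "no": 60},
    "females": {"yes": 40, "no": 60},
}

def conditional(table, group):
    """Distribution of voting pattern within one gender group."""
    row = table[group]
    total = sum(row.values())
    return {answer: Fraction(count, total) for answer, count in row.items()}

def marginal(table):
    """Overall distribution of voting pattern (all groups together)."""
    answers = list(next(iter(table.values())).keys())
    col_totals = {a: sum(row[a] for row in table.values()) for a in answers}
    grand_total = sum(col_totals.values())
    return {a: Fraction(c, grand_total) for a, c in col_totals.items()}

overall = marginal(table)            # column totals divided by the grand total
males = conditional(table, "males")  # male row divided by the male total

# Independence check: does the male conditional match the marginal?
independent = (males == overall)
print("overall:", {a: float(p) for a, p in overall.items()})
print("males:  ", {a: float(p) for a, p in males.items()})
print("independent:", independent)
```

Exact fractions are used instead of floating-point numbers so that the equality test compares 80 ÷ 200 and 40 ÷ 100 without rounding concerns.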
                                                    Here’s where a small table with only two rows and two columns cuts you
                                                    a break. You have to compare only one of the conditionals to the marginal
                                                    because you have only two groups to compare. If the voting pattern for the
                                                    males is the same as the overall voting pattern, then the same will be true
                                                    for the females. To check for independence when you have more than two
                                                    groups, you use a Chi-square test (discussed in my book Statistics II For
                                                    Dummies, published by Wiley).
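To see why one comparison is enough with only two groups, note that the female row is simply what is left of the totals once the male row is removed. The sketch below uses the totals quoted in the text, plus a hypothetical male row (44 yes, 66 no) that was chosen only because it gives 40% yes. It is an illustration, not the book's table.

```python
from fractions import Fraction

# Totals quoted in the text: 80 yes and 120 no out of 200 voters.
total_yes, total_no = 80, 120

# Hypothetical male row (an assumption): any row with 40% yes would do.
male_yes, male_no = 44, 66

# The female row is whatever remains after removing the males.
female_yes = total_yes - male_yes
female_no = total_no - male_no

overall_share = Fraction(total_yes, total_yes + total_no)
male_share = Fraction(male_yes, male_yes + male_no)
female_share = Fraction(female_yes, female_yes + female_no)

# If the male share equals the overall share, the female share must too.
print(overall_share, male_share, female_share)
```

The group sizes here are deliberately unequal (110 males, 90 females) to show that the result does not depend on the groups being the same size.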
                                                    Describing a dependent relationship
                                                    Two categorical variables are dependent if the conditional distributions
                                                    are different for at least two of the groups being compared. In the election
                                                    example from the previous section, the groups are males and females, and
                                                    the variable being compared is whether the person voted for the incumbent
                                                    president.
                                                    Dependence in this case means that knowing the outcome of the first variable
                                                    does tell you something about the outcome of the second variable. In the
                                                    election example, if dependence had been found, it would mean that males and
                                                    females didn’t have the same voting pattern for the incumbent (for example, a
                                                    higher percentage of males than females voting for the incumbent). (Pollsters use this kind of data to
                                                    help steer their campaign strategies.)
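As a rough illustration of what dependence looks like in a two-way table, the sketch below uses made-up counts (an assumption, not data from the book) in which males favor the incumbent more than females do. It compares the observed percentages directly, just as the text does.

```python
from fractions import Fraction

# Hypothetical counts (an assumption, not from the book): males favor the
# incumbent more than females do.
table = {
    "males":   {"yes": 60, "no": 40},
    "females": {"yes": 20, "no": 80},
}

def conditional(row):
    """Distribution of voting pattern within one group."""
    total = sum(row.values())
    return {answer: Fraction(count, total) for answer, count in row.items()}

conditionals = {group: conditional(row) for group, row in table.items()}

# Dependent if at least two groups have different conditional distributions.
distributions = list(conditionals.values())
dependent = any(d != distributions[0] for d in distributions[1:])

for group, dist in conditionals.items():
    print(group, {a: float(p) for a, p in dist.items()})
print("dependent:", dependent)
```

With these counts the males come out at 60% yes and the females at 20% yes, so the conditional distributions differ and the two variables are dependent.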
                                                   Other ways of saying two variables are dependent are to say they are related,
                                                    or associated. However, statisticians don’t use the term correlation to indi-
                                                    cate relationships between categorical variables. The word correlation in this
                                                    context applies to the linear relationship between two numerical variables
                                                    (such as height and weight), as seen in Chapter 18. (This mistake occurs in the
                                                    media all the time, and it drives us statisticians crazy!)







