Page 327 - Statistics for Dummies
Chapter 19: Two-Way Tables and Independence
Comparing marginal and conditional to check for independence
Another way to check for independence is to see whether the marginal distri-
bution of voting pattern (overall) equals the conditional distribution of voting
pattern for each of the gender groups (males and females). If these distribu-
tions are equal, then gender doesn’t matter. Again, gender and voting pattern
are independent.
Looking at the voting pattern example, you find the conditional distribution of
voting pattern for the males (first bar in Figure 19-4) is 40% yes and 60% no. To
find the marginal (overall) distribution of voting pattern (males and females
together), take the marginal column totals in the last row of Table 19-8 (80 yes
and 120 no) and divide through by 200 (the grand total). You get 80 ÷ 200 =
0.40 or 40% yes, and 120 ÷ 200 = 0.60 or 60% no. (See the section “Calculating
marginal distributions” earlier in this chapter for more explanation.) The mar-
ginal distribution of overall voting pattern matches the conditional distribu-
tion of voting pattern for males, so voting pattern is independent of gender.
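The comparison above can be sketched in a few lines of Python. Only the column totals (80 yes, 120 no), the grand total (200), and the 40%/60% split for males come from the text. The individual cell counts and the 100/100 split between males and females are assumptions chosen to match those totals, and the function names are made up for illustration.

```python
import math

# Hypothetical cell counts. Only the column totals (80 yes, 120 no), the
# grand total (200), and the males' 40%/60% split are given in the text.
table = {
    "males":   {"yes": 40, "no": 60},
    "females": {"yes": 40, "no": 60},
}

def conditional(row):
    """Distribution of voting pattern within one gender group."""
    total = sum(row.values())
    return {answer: count / total for answer, count in row.items()}

def marginal(tbl):
    """Overall distribution of voting pattern (all groups together)."""
    column_totals = {}
    for row in tbl.values():
        for answer, count in row.items():
            column_totals[answer] = column_totals.get(answer, 0) + count
    grand_total = sum(column_totals.values())
    return {answer: count / grand_total for answer, count in column_totals.items()}

overall = marginal(table)              # 80/200 yes, 120/200 no
males = conditional(table["males"])    # 40/100 yes, 60/100 no

# The marginal matches the conditional for males, answer by answer.
matches = all(math.isclose(overall[a], males[a]) for a in overall)
print(overall, males, matches)
```

With these assumed counts, the overall distribution and the distribution for males both come out to 40% yes and 60% no, so the check reports a match.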
Here’s where a small table with only two rows and two columns cuts you
a break. You have to compare only one of the conditionals to the marginal
because you have only two groups to compare. If the voting pattern for the
males is the same as the overall voting pattern, then the same will be true
for the females. To check for independence when you have more than two
groups, you use a Chi-square test (discussed in my book Statistics II For
Dummies, published by Wiley).
Describing a dependent relationship
Two categorical variables are dependent if the conditional distributions
are different for at least two of the groups being compared. In the election
example from the previous section, the groups are males and females, and
the variable being compared is whether the person voted for the incumbent
president.
Dependence in this case means that knowing the outcome of the first variable
does tell you something about the outcome of the second variable. In the election example,
if dependence had been found, it would mean that males and females didn’t
have the same voting pattern for the incumbent (for example, more males
voting for the incumbent than females). (Pollsters use this kind of data to
help campaigns steer their strategies.)
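To see what a dependent relationship looks like in numbers, here is a minimal Python sketch. The counts are hypothetical (they are not from Table 19-8) and were picked so that more males than females vote for the incumbent. The helper name looks_dependent is also just for illustration. It applies the definition from this section: the variables are dependent if the conditional distributions differ for at least two groups.

```python
import math

# Hypothetical counts, invented for illustration only: more males than
# females vote "yes" for the incumbent.
table = {
    "males":   {"yes": 60, "no": 40},
    "females": {"yes": 20, "no": 80},
}

def conditional(row):
    """Distribution of voting pattern within one group."""
    total = sum(row.values())
    return {answer: count / total for answer, count in row.items()}

def looks_dependent(tbl):
    """True if any group's conditional distribution differs from the first
    group's (so at least two groups differ from each other)."""
    distributions = [conditional(row) for row in tbl.values()]
    first = distributions[0]
    return any(
        not math.isclose(dist[answer], first[answer])
        for dist in distributions[1:]
        for answer in first
    )

for group, row in table.items():
    print(group, conditional(row))
print("dependent:", looks_dependent(table))
```

Here the males come out 60% yes and 40% no while the females come out 20% yes and 80% no, so the two conditional distributions differ and the table shows a dependent relationship. The check describes only the table you feed it.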
Other ways of saying two variables are dependent are to say they are related,
or associated. However, statisticians don’t use the term correlation to indi-
cate relationships between categorical variables. The word correlation in this
context applies to the linear relationship between two numerical variables
(such as height and weight), as seen in Chapter 18. (This mistake occurs in the
media all the time, and it drives us statisticians crazy!)