Page 318 - Statistics for Dummies
Part V: Statistical Studies and the Hunt for a Meaningful Relationship
From the results of the two separate marginal distributions for the pet camping
and opinion variables, you can say that the majority of all the campers in this
sample are non–pet campers (70%) and the majority of all the campers in this
sample (75%) support the idea of having a pet section.
While marginal distributions show you how each variable breaks down on its
own, they don’t tell you about the connection between the two variables. For the
camping example, you know what percentage of all campers support a new
pet section, but you can’t distinguish the opinions of the pet campers from the
non–pet campers. Distributions for making such comparisons are found in the
later section, “Comparing groups with conditional distributions.”
Examining all groups: a joint distribution
Story time: A certain auto manufacturer conducted a survey to see what
characteristics customers prefer in their small pickup trucks. They found that the
most popular color for these trucks was red and the most popular option was
four-wheel drive. In response to these results, the company started making
more of their small pickup trucks red with four-wheel drive.
Guess what? They struck out; people weren’t buying those trucks. Turns out
that the customers who bought the red trucks were more likely to be women,
and women didn’t use four-wheel drive as often as men did. Customers who
bought the four-wheel drive trucks were more likely to be men, and they
tended to prefer black ones over red ones. So the most popular outcome of
the first variable (color) paired with the most popular outcome of the second
variable (options on the vehicle) doesn’t necessarily add up to the most
popular combination of the two variables.
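To see how this can happen with actual numbers, here is a small Python sketch. The survey's real counts aren't given, so every figure below is made up purely for illustration, and the names (cell_counts, color_totals, drive_totals) are hypothetical. It totals each variable on its own, then compares those winners with the biggest single cell.

```python
from collections import Counter

# Hypothetical counts, NOT the manufacturer's data: (color, drive) -> buyers
cell_counts = {
    ("red", "4WD"): 17,
    ("red", "2WD"): 38,
    ("black", "4WD"): 40,
    ("black", "2WD"): 5,
}

# Marginal totals: add up the cells for each color and for each drive option
color_totals = Counter()
drive_totals = Counter()
for (color, drive), n in cell_counts.items():
    color_totals[color] += n
    drive_totals[drive] += n

# Most popular outcome of each variable on its own
top_color = max(color_totals, key=color_totals.get)
top_drive = max(drive_totals, key=drive_totals.get)

# Most popular combination of the two variables together
top_cell = max(cell_counts, key=cell_counts.get)

print("Top color:", top_color)
print("Top option:", top_drive)
print("Top combination:", top_cell)
```

With these invented counts, red wins on color and four-wheel drive wins on options, yet the largest cell is black with four-wheel drive. Pairing the two marginal winners points you at the wrong truck.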
To figure out which combination of two categorical variables contains the
highest proportion, you need to compare the cell proportions (for example,
the color and vehicle options together) rather than the marginal proportions
(the color and vehicle option separately). The joint distribution of both
variables in a two-way table is a listing of all possible row and column
combinations and the proportion of individuals within each group. You use it to
answer questions involving two characteristics, such as “What proportion of
the voters are Democrat and female?” or “What percentage of the campers
are pet campers who support a pet section?” In the following sections, I show
you how to calculate and graph joint distributions.
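As a rough sketch of the idea, the Python snippet below turns a two-way table of counts into a joint distribution by dividing each cell by the grand total. The voter counts are hypothetical (they don't come from any real poll), and the function name joint_distribution is just an illustrative choice.

```python
def joint_distribution(counts):
    """Return each cell's share of the grand total.

    counts maps (row_label, column_label) -> number of individuals.
    """
    grand_total = sum(counts.values())
    if grand_total == 0:
        raise ValueError("the table has no observations")
    return {cell: n / grand_total for cell, n in counts.items()}


# Hypothetical two-way table: (party, sex) -> number of voters
voter_counts = {
    ("Democrat", "female"): 30,
    ("Democrat", "male"): 20,
    ("Republican", "female"): 25,
    ("Republican", "male"): 25,
}

joint = joint_distribution(voter_counts)

# "What proportion of the voters are Democrat and female?"
print(joint[("Democrat", "female")])
```

Because every individual lands in exactly one cell, the proportions in a joint distribution always add up to 1 (or 100%), which makes a handy check on your arithmetic.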
Calculating joint distributions
A joint distribution shows the proportion of the data that lies in each cell of
the two-way table. For the pet camping example, the four row-column
combinations are: