Page 55 - Intermediate Statistics for Dummies
Part I: Data Analysis and Model-Building Basics
Suppose you’ve collected data on a random sample of 1,000 United States
voters. You may want to compare the proportion of female voters to the pro-
portion of male voters and find out whether they’re equal. Suppose in your
sample you find that the proportion of females is 0.53, and the proportion of
males is 0.47. So for this sample of 1,000 people, you have a higher propor-
tion of females than males. But here’s the big question: Are these sample pro-
portions different enough to say that the entire population of U.S. voters has
more females in it than males? After all, sample results vary from sample to
sample. The answer to this question requires comparing the sample propor-
tions by using a hypothesis test for two proportions. I demonstrate and
expand on this technique in Chapter 3.
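To get a feel for the numbers before you reach Chapter 3, here's a minimal sketch in Python. It makes two assumptions that the text doesn't spell out: it uses the normal approximation, and because the female and male proportions in a single sample add up to 1, it treats the question as whether the female proportion differs from 0.5. That's a simplification for illustration, not necessarily the exact procedure developed in Chapter 3, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(p_hat, n, p0=0.5):
    """Z-test of a sample proportion against a hypothesized value p0.

    Assumes a simple random sample that is large enough for the
    normal approximation to hold.
    """
    # Standard error of the sample proportion if the null hypothesis is true
    standard_error = sqrt(p0 * (1 - p0) / n)
    # How many standard errors the sample proportion sits from p0
    z = (p_hat - p0) / standard_error
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Sample from the text: 1,000 voters, 53 percent female
z, p_value = proportion_z_test(p_hat=0.53, n=1000)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```

Under these assumptions the test statistic lands near 1.9, which falls just short of the usual 0.05 cutoff. A 53/47 split in a sample of 1,000 looks lopsided, but it's still within the range that ordinary sample-to-sample variation can produce.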
Estimating a proportion
You can also use relative frequencies (check out the section “Qualitative
versus Quantitative Variables in Statistical Analysis”) to make estimates
about a single population proportion.
Say, for example, you want to know what proportion of females in the United
States are Democrats. According to a 2003 survey of 29,839 female U.S. voters
conducted by the Pew Research Foundation, 36 percent of the women sampled
were Democrats. Now because the Pew researchers based these
results on only a sample of the population and not on the entire population,
these results may vary from sample to sample. The amount of variability is
measured by the margin of error (the amount that you add and subtract from
your sample statistic), which for this sample is only about 0.5 percent. (To
find out how to calculate margin of error, explore Chapter 3.) That means that
the percentage of female voters in the U.S. who are Democrats is estimated to
be somewhere between 35.5 percent and 36.5 percent.
The margin of error, combined with the sample proportion, forms what statis-
ticians call a confidence interval for the population proportion. Recall from
intro stats that a confidence interval is a range of likely values for a popula-
tion parameter, formed by taking the sample statistic plus or minus the
margin of error. (For more on confidence intervals, see Chapter 3.)
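The arithmetic behind the Pew example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text doesn't give the confidence level, so the sketch assumes 95 percent, and it uses the standard large-sample formula for a proportion. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion.

    The 95 percent default is an assumption; the text does not
    state the confidence level behind its margin of error.
    """
    # Critical value z* for the chosen confidence level (about 1.96 for 95%)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Margin of error: z* times the standard error of the sample proportion
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    # Confidence interval: sample statistic plus or minus the margin of error
    return p_hat - margin, p_hat + margin, margin


# Pew example from the text: 29,839 female voters, 36 percent Democrats
low, high, margin = proportion_confidence_interval(p_hat=0.36, n=29839)
print(f"margin of error = {margin:.2%}")
print(f"confidence interval = {low:.1%} to {high:.1%}")
```

With these inputs the margin of error works out to roughly half a percentage point, which matches the 35.5 to 36.5 percent range quoted for the Pew sample.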
Looking for relationships between qualitative variables
Suppose you want to know whether two qualitative variables are related
(for example, is gender related to political affiliation?). Answering this ques-
tion requires putting the sample data into a two-way table (using rows and