Page 39 - Statistics II for Dummies

Chapter 2: Finding the Right Analysis for the Job 23

The next step is to find a means to relate these numbers to each other in
an easy way. You can do this by using the relative frequency, which is the
percentage of data that falls into a specific category of a categorical
variable. You can find a category's relative frequency by dividing the frequency
by the sample total and then multiplying by 100. In this case, the relative
frequencies work out to 36 percent for males and 64 percent for females.

You can also express the relative frequency as a proportion in each group by
leaving the result in decimal form and not multiplying by 100. This statistic is
called the sample proportion. In this example, the sample proportion of males
is 0.36, and the sample proportion of females is 0.64.
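
The arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the raw counts behind the example are not given here, so the data below (36 males and 64 females out of 100) are hypothetical values chosen to reproduce the stated proportions, and the function name summarize is made up for this sketch.

```python
from collections import Counter


def summarize(values):
    """Map each category to (frequency, relative frequency in %, sample proportion)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {
        category: (freq, 100 * freq / total, freq / total)
        for category, freq in counts.items()
    }


# Hypothetical data, assumed for illustration only: 36 males and 64 females.
sample = ["male"] * 36 + ["female"] * 64
summary = summarize(sample)

for category, (freq, percent, proportion) in summary.items():
    print(f"{category}: frequency={freq}, "
          f"relative frequency={percent:.0f}%, proportion={proportion:.2f}")
```

The percentage and the proportion carry the same information; the only difference is whether you multiply by 100.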

You mainly summarize categorical variables by using two statistics — the
number in each category (frequency) and the percentage (relative frequency)
in each category.

Statistics for Categorical Variables

The types of statistics done on categorical data may seem limited; however,
the wide variety of analyses you can perform using frequencies and relative
frequencies offers answers to an extensive range of possible questions you
may want to explore.

In this section, you see that the proportion in each group is the number-one
statistic for summarizing categorical data. Beyond that, you see how you can
use proportions to estimate, compare, and look for relationships between the
groups that comprise the categorical data.

Estimating a proportion

You can use relative frequencies to make estimates about a single population
proportion. (Refer to the earlier section “Categorical versus Quantitative
Variables” for an explanation of relative frequencies.)

Suppose you want to know what proportion of females in the United States
are Democrats. According to a 2003 survey of 29,839 female voters in the U.S.
conducted by the Pew Research Center, 36 percent of the women sampled were
Democrats. Because the Pew researchers based these results on only a sample
of the population and not on the entire population, their results would
likely differ if they took another sample. This variation in sample results
is cleverly called (you guessed it) sampling variability.
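
A small simulation can make sampling variability concrete. The sketch below rests on two assumptions that are not in the text: the true population proportion is set to 0.36 purely for illustration (in practice it is unknown), and each sample is treated as a simple random sample, which may not match how the actual survey was run. The helper name sample_proportion is hypothetical.

```python
import random


def sample_proportion(p_true, n, rng):
    """Draw n independent yes/no responses and return the share of 'yes'."""
    hits = sum(1 for _ in range(n) if rng.random() < p_true)
    return hits / n


rng = random.Random(2003)  # fixed seed so the run is repeatable
P_ASSUMED = 0.36           # hypothetical population proportion (an assumption)
N = 29_839                 # sample size quoted in the text

# Take five separate samples from the same population and compare the results.
proportions = [sample_proportion(P_ASSUMED, N, rng) for _ in range(5)]
for i, p_hat in enumerate(proportions, start=1):
    print(f"sample {i}: sample proportion = {p_hat:.4f}")
```

Each run of the loop plays the role of "taking another sample": the sample proportions land close to the assumed population value but generally do not match it or each other exactly.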
