Page 122 - Statistics for Dummies
Part II: Number-Crunching Basics
• If the histogram has very narrow ranges for its groups, it looks like
a big series of tiny bars that cloud the big picture. This can make
the data look very choppy with no real pattern.
✓ Make sure your groups have equal widths. If one bar is wider than the
others, it covers a larger range of values and may hold more data than its
neighbors for that reason alone.
One idea that may be appropriate for your histogram is to take the range of
the data (largest minus smallest) and divide by 10 to get the width of each of
10 equal groupings.
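The range-divided-by-10 idea can be sketched in a few lines of Python. This is only an illustration: the function name equal_width_edges and the ages list are made up for the example and are not the Academy Award data.

```python
def equal_width_edges(data, groups=10):
    """Return groups + 1 equally spaced group boundaries spanning the data."""
    smallest, largest = min(data), max(data)
    width = (largest - smallest) / groups  # range divided by the number of groups
    # Build the boundaries from the smallest value, then pin the last one to
    # the largest value so floating-point rounding cannot leave it just short.
    edges = [smallest + i * width for i in range(groups)]
    edges.append(largest)
    return edges


ages = [21, 26, 30, 33, 41, 45, 61, 80]  # hypothetical ages, for illustration only
edges = equal_width_edges(ages)
print(edges)
```

Ten groups need eleven boundaries, which is why the list has one more entry than the number of groups. In practice you may prefer to round the width to a convenient number, such as 5 years.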
Handling borderline values
In the Academy Award example, what happens if an actress’s age lies right
on a borderline? For example, in Table 7-1 Norma Shearer was 30 years old in
1930 when she won the Oscar for The Divorcee. Does she belong in the 25–30
age group (the lower bar) or the 30–35 age group (the upper bar)?
As long as you are consistent with all the data points, you can either put all
the borderline points into their respective lower bars or put all of them into
their respective upper bars. The important thing is to pick a direction and be
consistent. In Figure 7-1, I went with the convention of putting all borderline
values into their respective upper bars, which puts Norma Shearer’s age in
the third bar, the 30–35 age group.
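Here is a minimal Python sketch of the upper-bar convention. It assumes 5-year groups starting at age 20, which is consistent with 30–35 being the third bar. The function name upper_bar_label and the ages list are hypothetical and are not the values in Table 7-1.

```python
from collections import Counter


def upper_bar_label(value, start=20, width=5):
    """Label the group for a value, sending borderline values to the upper bar."""
    # Floor division makes each group include its lower boundary and exclude
    # its upper boundary, so exactly 30 lands in 30-35 rather than 25-30.
    index = int((value - start) // width)
    low = start + index * width
    return f"{low}-{low + width}"


ages = [24, 30, 30, 34, 35, 41]  # hypothetical ages, for illustration only
counts = Counter(upper_bar_label(age) for age in ages)
print(upper_bar_label(30))  # 30-35
print(counts)
```

Putting borderline values into the lower bars instead would mean each group includes its upper boundary and excludes its lower one. Either choice works as long as it is applied to every data point.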
Clarifying the axes
The most complex part of interpreting a histogram is getting a handle on
what’s being shown on the x- and y-axes. Good descriptive labels on the axes
help. Most statistical software packages label the x-axis using the variable
name you provided when you entered your data (for example, “age” or
“weight”). However, the label for the y-axis isn’t as clear.
Statistical software packages often label the y-axis of a histogram by writing
“frequency” or “percent” by default. These terms can be confusing: frequency
or percentage of what?
Clarify the y-axis label on your histogram by changing “frequency” to “number
of” and adding the variable name. To modify a label that simply reads “percent,”
clarify by writing “percentage of” and the variable. For example, in the
histogram of ages of the Best Actress winners shown in Figure 7-1, I labeled the y-axis
“Percentage of actresses in each age group.” In the next section you see how to
interpret the results from a histogram. How old are those actresses anyway?
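The relabeling advice can be sketched in Python as follows. This is an illustration only: the function names percentages and y_axis_label and the counts are made up, and no particular statistical package works this way.

```python
def percentages(counts):
    """Convert the number of values in each group into a percentage of the total."""
    total = sum(counts.values())
    return {group: 100 * n / total for group, n in counts.items()}


def y_axis_label(default_label, variable, grouping):
    """Turn a default 'frequency' or 'percent' label into a descriptive one."""
    prefix = {"frequency": "Number of", "percent": "Percentage of"}[default_label]
    return f"{prefix} {variable} in each {grouping}"


counts = {"20-25": 1, "25-30": 3}  # hypothetical counts, for illustration only
shares = percentages(counts)
label = y_axis_label("percent", "actresses", "age group")
print(shares)
print(label)
```

The descriptive label answers the "percentage of what?" question directly, so the reader does not have to guess.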
Interpreting a histogram
A histogram tells you three main features of numerical data:
✓ How the data are distributed among the groups (statisticians call this
the shape of the data)