Page 125 - Statistics for Dummies
Chapter 7: Going by the Numbers: Graphing Numerical Data
percentages from the last two bars in the histogram). The last three bars
are what make the data have a shape that is skewed right.
Measuring center: Mean versus median
A histogram gives you a rough idea of where the “center” of the data lies. The
word center is in quotes because many different statistics are used to designate
center. The two most common measures of center are the average (the
mean) and the median. (For details on measures of center, see Chapter 5.)
To visualize the average age (the mean), picture the data as people sitting on a
teeter-totter. Your objective is to balance it. Because data don’t move around,
assume the people stay where they are and you move the pivot point (which
you can also think of as the hinge or fulcrum) anywhere you want. The mean
is the place the pivot point has to be in order to balance the weight on each
side of the teeter-totter.
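
The balancing idea can be checked numerically. The sketch below is one way to make the picture concrete, and it rests on assumptions that are not in the text: the ages are invented for illustration, each data value is treated as a seat position along the board, and every person is assumed to weigh the same. Under those assumptions the board balances when the signed distances from the pivot add up to zero, which happens at the mean.

```python
from statistics import mean

# Hypothetical ages, invented for illustration (not the book's data set).
ages = [22, 25, 31, 38, 64]

def net_tilt(data, pivot):
    """Sum of signed distances from the pivot.

    Assumes each value is a seat position and everyone weighs the same.
    A result of zero means the two sides cancel and the board balances.
    """
    return sum(x - pivot for x in data)

balance_point = mean(ages)  # 36 for these ages

print(balance_point)
print(net_tilt(ages, balance_point))  # 0: balanced at the mean
print(net_tilt(ages, 31))             # nonzero: pivot at 31 tips toward the high side
```

Moving the pivot anywhere other than the mean leaves a nonzero total, so the board tips.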
The balancing point of the teeter-totter is affected by the weights of the
people on each side, not by the number of people on each side. So the mean
is affected by the actual values of the data, rather than the amount of data.
The median is the place where you put the pivot point so you have an
equal number of people on each side of the teeter-totter, regardless of their
weights. With the same number of people on each side, the teeter-totter
wouldn’t balance in terms of weight unless the teeter-totter had people with
the same total weight on each side. So the median depends on where each value
falls in the ordered data set, not on how large or small the values are.
The mean is affected by outliers, values in the data set that lie far away from
the rest of the data on the high end, the low end, or both. The median, being
the middle number, is largely unaffected by outliers.
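
A small numeric comparison shows the difference. This is only a sketch: the ages are hypothetical, and the extreme value of 164 is an assumption chosen to exaggerate the effect.

```python
from statistics import mean, median

# Hypothetical ages, invented for illustration.
ages = [22, 25, 31, 38, 64]

# Same data, but the largest value is replaced by an extreme (made-up) outlier.
with_outlier = [22, 25, 31, 38, 164]

print(mean(ages), median(ages))                  # 36 31
print(mean(with_outlier), median(with_outlier))  # 56 31
```

The mean jumps from 36 to 56 because it uses the actual size of every value, while the median stays at 31 because the middle position in the ordered list has not changed.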
Viewing variability: Amount of spread around the mean
You also get a sense of variability in the data by looking at a histogram. For
example, if the data are all the same, they are all placed into a single bar,
and there is no variability. If an equal amount of data is in each group, the
histogram looks flat with the bars close to the same height; this means a fair
amount of variability.
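
To put numbers on this, the sketch below builds three small made-up data sets and compares their bar heights and spread. Everything here is an assumption for illustration: the values are invented, each distinct value is treated as its own histogram bar, and the population standard deviation stands in for "variability".

```python
from collections import Counter
from statistics import pstdev

# Hypothetical data sets, invented for illustration.
constant = [5] * 10                        # every value identical: one bar
flat = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]      # equal count in each group: flat bars
mound = [1, 2, 3, 3, 3, 3, 3, 3, 4, 5]     # most values piled up in the middle

for name, data in [("constant", constant), ("flat", flat), ("mound", mound)]:
    bar_heights = dict(sorted(Counter(data).items()))  # value -> count
    print(name, bar_heights, round(pstdev(data), 3))
```

The single-bar data set has a standard deviation of zero, the flat one has the largest spread of the three, and the mounded one falls in between because most of its values sit close to the mean.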
The idea of a flat histogram indicating some variability may go against your
intuition, and if it does you’re not alone. If you’re thinking a flat histogram
means no variability, you’re probably thinking about a time chart, where
single numbers are plotted over time (see the section “Tackling Time Charts”
later in this chapter). Remember, though, that a histogram doesn’t show data
over time — it shows all the data at one point in time.
Equally confusing is the idea that a histogram with a big lump in the middle
and tails sloping sharply down on each side actually has less variability than
a histogram that’s straight across. The curves looking like hills in a histogram