Page 125 - Statistics for Dummies

Chapter 7: Going by the Numbers: Graphing Numerical Data
                                                    percentages from the last two bars in the histogram). The last three bars
                                                    are what make the data have a shape that is skewed right.
                                                    Measuring center: Mean versus median
                                                    A histogram gives you a rough idea of where the “center” of the data lies. The
                                                    word center is in quotes because many different statistics are used to designate
                                                    center. The two most common measures of center are the average (the
                                                    mean) and the median. (For details on measures of center, see Chapter 5.)

                                                    To visualize the average age (the mean), picture the data as people sitting on a
                                                    teeter-totter. Your objective is to balance it. Because data don’t move around,
                                                    assume the people stay where they are and you move the pivot point (which
                                                    you can also think of as the hinge or fulcrum) anywhere you want. The mean
                                                    is the place the pivot point has to be in order to balance the weight on each
                                                    side of the teeter-totter.
                                                    The balancing point of the teeter-totter is affected by the weights of the
                                                    people on each side, not by the number of people on each side. So the mean
                                                    is affected by the actual values of the data, rather than the amount of data.
                                                    The median is the place where you put the pivot point so you have an
                                                    equal number of people on each side of the teeter-totter, regardless of their
                                                    weights. With the same number of people on each side, the teeter-totter
                                                    wouldn’t balance in terms of weight unless the teeter-totter had people with
                                                    the same total weight on each side. So the median isn’t affected by how large
                                                    or small the individual values are, just by their order within the data set.
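The teeter-totter picture can be checked with a few numbers. The following is a minimal sketch in Python, assuming a small made-up list of ages (the `ages` values are hypothetical and are not the data behind the book's histogram). It shows that the distances from the mean cancel out on the two sides, while the median splits the count of values evenly.

```python
from statistics import mean, median

# Hypothetical ages, chosen for illustration only (skewed right).
ages = [21, 23, 24, 26, 30, 41, 59]

m = mean(ages)      # balance point of the teeter-totter
med = median(ages)  # point with equal head count on each side

# Mean: total distance below the mean equals total distance above it.
pull_left = sum(m - x for x in ages if x < m)
pull_right = sum(x - m for x in ages if x > m)

# Median: the same number of values sits on each side.
count_below = sum(1 for x in ages if x < med)
count_above = sum(1 for x in ages if x > med)

print("mean:", m, "pull left:", pull_left, "pull right:", pull_right)
print("median:", med, "below:", count_below, "above:", count_above)
```

In this made-up list, five values sit below the mean and only two sit above it, yet the distances still cancel. That is the sense in which the mean responds to the values themselves and not to the head count.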
                                                    The mean is affected by outliers, values in the data set that lie far away from
                                                    the rest of the data, on the high end and/or the low end. The median, being the
                                                    middle number, is largely unaffected by outliers.
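To see how an outlier pulls on the mean but not on the median, here is a small hedged sketch in Python. The numbers are invented for illustration: the list `stretched` is the same as `values` except that the largest value has been replaced by an extreme one.

```python
from statistics import mean, median

# Hypothetical data, for illustration only.
values = [21, 23, 24, 26, 30]
stretched = [21, 23, 24, 26, 300]  # same data, largest value made extreme

mean_before, mean_after = mean(values), mean(stretched)
median_before, median_after = median(values), median(stretched)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

The mean jumps because it uses the actual size of every value. The median stays where it was because the middle position in the sorted list has not changed. Adding an extra extreme value, as opposed to stretching an existing one, could move the median over by a position, but typically only slightly.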

                                                    Viewing variability: Amount of spread around the mean
                                                    You also get a sense of variability in the data by looking at a histogram. For
                                                    example, if the data are all the same, they are all placed into a single bar,
                                                    and there is no variability. If an equal amount of data is in each group, the
                                                    histogram looks flat with the bars close to the same height; this means a fair
                                                    amount of variability.
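The two cases just described can be compared numerically. This is a minimal sketch in Python, assuming two made-up data sets and using the standard deviation as the measure of spread around the mean (one common choice; the names `all_same` and `flat` are only for this illustration).

```python
from collections import Counter
from statistics import pstdev

# Hypothetical data sets, for illustration only.
all_same = [5] * 12        # every value identical: one single bar
flat = [1, 2, 3, 4] * 3    # equal amount of data in each of four groups

# Bar heights a histogram would show (one bar per distinct value here).
bars_same = Counter(all_same)
bars_flat = Counter(flat)

# Population standard deviation: typical distance from the mean.
spread_same = pstdev(all_same)
spread_flat = pstdev(flat)

print("single bar:", dict(bars_same), "spread:", spread_same)
print("flat:      ", dict(bars_flat), "spread:", spread_flat)
```

The single-bar data set has a spread of zero, while the flat one has a clearly positive spread even though all of its bars have the same height.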
                                                    The idea of a flat histogram indicating some variability may go against your
                                                    intuition, and if it does you’re not alone. If you’re thinking a flat histogram
                                                    means no variability, you’re probably thinking about a time chart, where
                                                    single numbers are plotted over time (see the section “Tackling Time Charts”
                                                    later in this chapter). Remember, though, that a histogram doesn’t show data
                                                    over time — it shows all the data at one point in time.
                                                    Equally confusing is the idea that a histogram with a big lump in the middle
                                                    and tails sloping sharply down on each side actually has less variability than
                                                    a histogram that’s straight across. The curves looking like hills in a histogram





