Page 106 - Statistics for Dummies
                                         Part II: Number-Crunching Basics
To find Q1 and Q3, you use the steps shown in the section “Calculating percentiles,” with n = 25. Step 1 is done because the data are ordered. For Step 2, since Q1 is the 25th percentile, multiply 0.25 ∗ 25 = 6.25. This is not a whole number, so Step 3a says to round it up to 7 and proceed to Step 3b.

Following Step 3b, you count from left to right in the data set until you reach the 7th number, 68; this is Q1. For Q3 (the 75th percentile) you multiply 0.75 ∗ 25 = 18.75, which you round up to 19. The 19th number on the list is 89, so that’s Q3. Putting it all together, the five-number summary for these 25 test scores is 43, 68, 77, 89, and 99. To best interpret a five-number summary, you can use a boxplot; see Chapter 7 for details.
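The percentile steps used above can be sketched in Python. This is only a minimal sketch, not code from the book: the names `percentile` and `five_number_summary` and the `sample` data are hypothetical, and the whole-number branch (averaging the value at that position with the next one) is an assumption, since this section only works through the round-up case.

```python
def percentile(ordered, k):
    """Return the k-th percentile (0 < k < 100, k an integer) of a sorted list."""
    if not ordered:
        raise ValueError("data set is empty")
    if not 0 < k < 100:
        raise ValueError("k must be strictly between 0 and 100")
    n = len(ordered)
    # Step 2: k percent of n, split into whole part and remainder
    # so that no floating-point rounding is involved.
    whole, remainder = divmod(k * n, 100)
    if remainder:
        # Steps 3a/3b: not a whole number, so round up and count to that
        # position. The 1-based position whole + 1 is list index whole.
        return ordered[whole]
    # ASSUMPTION (not shown in this section): for a whole-number result,
    # average the value at that position with the next value.
    return (ordered[whole - 1] + ordered[whole]) / 2


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(data)  # Step 1: order the data
    return (
        ordered[0],
        percentile(ordered, 25),
        percentile(ordered, 50),
        percentile(ordered, 75),
        ordered[-1],
    )


sample = [12, 2, 9, 4, 7, 4, 11, 5, 8]  # hypothetical data, not the test scores
print(five_number_summary(sample))  # (2, 4, 7, 9, 12)
```

With n = 25, the same arithmetic gives `divmod(25 * 25, 100) == (6, 25)`, so position 7, and `divmod(75 * 25, 100) == (18, 75)`, so position 19, which matches the positions found in the text.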
Exploring interquartile range

The purpose of the five-number summary is to give descriptive statistics for center, variation, and relative standing all in one shot. The measure of center in the five-number summary is the median, and the first quartile, median, and third quartile are measures of relative standing.
To obtain a measure of variation based on the five-number summary, you can find what’s called the interquartile range (or IQR). The IQR equals Q3 – Q1 (that is, the 75th percentile minus the 25th percentile) and reflects the distance taken up by the innermost 50% of the data. If the IQR is small, you know a lot of data are close to the median. If the IQR is large, you know the data are more spread out from the median. The IQR for the test scores data set is 89 – 68 = 21, which is fairly large, seeing as how test scores only go from 0 to 100.
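The IQR arithmetic can be checked in a few lines of Python. The variable names are illustrative; the numbers are the five-number summary given earlier for the 25 test scores.

```python
# Five-number summary reported in the text for the 25 test scores.
minimum, q1, median, q3, maximum = 43, 68, 77, 89, 99

iqr = q3 - q1                    # distance covered by the middle 50% of the data
value_range = maximum - minimum  # regular range, shown for comparison

print(iqr)          # 21
print(value_range)  # 56
```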
The interquartile range is a much better measure of variation than the regular range (maximum value minus minimum value; see the section “Being out of range” earlier in this chapter). That’s because the interquartile range isn’t thrown off by outliers; it leaves them out by focusing only on the distance within the middle 50 percent of the data (that is, between the 25th and 75th percentiles).
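To see why outliers affect the range but not the IQR, here is a small sketch that swaps one value for an extreme one. Both data sets are hypothetical. The `quartile` helper follows the round-up rule used for the test scores; its whole-number branch (averaging two neighboring values) is an assumption not covered in this section.

```python
def quartile(ordered, k):
    """k-th percentile of a sorted list, using the round-up rule."""
    whole, remainder = divmod(k * len(ordered), 100)
    if remainder:
        return ordered[whole]  # rounded-up 1-based position is index whole
    # ASSUMPTION: whole-number result, so average this value and the next.
    return (ordered[whole - 1] + ordered[whole]) / 2


def spread(data):
    """Return (range, IQR) for a data set."""
    ordered = sorted(data)
    value_range = ordered[-1] - ordered[0]
    iqr = quartile(ordered, 75) - quartile(ordered, 25)
    return value_range, iqr


scores = [61, 64, 66, 70, 72, 75, 78, 80, 83]       # hypothetical scores
with_outlier = [5, 64, 66, 70, 72, 75, 78, 80, 83]  # lowest score replaced by 5

print(spread(scores))        # (22, 12)
print(spread(with_outlier))  # (78, 12)
```

In this sketch the single extreme score more than triples the range, while the IQR stays at 12 because the middle 50 percent of the data never changed.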
Descriptive statistics that are well chosen and used correctly can tell you a great deal about a data set, such as where the center is located, how diverse the data are, and where a good portion of the data lies. However, descriptive statistics can’t tell you everything about the data, and in some cases they can be misleading. Be on the lookout for situations where a different statistic would be more appropriate (for example, the median describes center more fairly than the mean when the data are skewed), and keep your eyes peeled for situations where critical statistics are missing (for example, when a mean is reported without a corresponding standard deviation).
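Both warnings can be illustrated with Python's standard `statistics` module. The data sets below are hypothetical and chosen only to make the contrast obvious; treat this as a sketch rather than a general rule for picking statistics.

```python
import statistics

# Skewed data: one very large value pulls the mean far above the median.
skewed = [30, 32, 35, 38, 40, 42, 45, 300]  # hypothetical values
skewed_mean = statistics.mean(skewed)
skewed_median = statistics.median(skewed)
print(skewed_mean)    # 70.25
print(skewed_median)  # 39.0

# Same mean, very different variation: the mean alone hides the difference.
tight = [49, 50, 51]  # hypothetical values
wide = [0, 50, 100]   # hypothetical values
print(statistics.mean(tight), statistics.stdev(tight))  # 50 1.0
print(statistics.mean(wide), statistics.stdev(wide))    # 50 50.0
```

Here the median (39.0) describes a typical value in the skewed set better than the mean (70.25), and the two sets that share a mean of 50 can only be told apart once the standard deviation is reported.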


