Page 216 - Statistics for Dummies
                                         Part IV: Guesstimating and Hypothesizing with Confidence

                                                    When you need a high level of confidence, you have to increase the z*-value
                                                    and, hence, margin of error, resulting in a wider confidence interval, which
                                                    isn’t good. (See the previous section.) But you can offset this wider confidence
                                                    interval by increasing the sample size and bringing the margin of error back
                                                    down, thus narrowing the confidence interval.
                                                    The increase in sample size allows you to still have the confidence level you
                                                    want, but also ensures that the width of your confidence interval will be small
                                                    (which is what you ultimately want). You can even determine the sample
                                                    size you need before you start a study: If you know the margin of error you
                                                    want to get, you can set your sample size accordingly. (See the later section
                                                    “Figuring Out What Sample Size You Need” for more.)
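
The trade-off described above can be checked numerically. The following is a minimal sketch, not the book's own procedure: it assumes the standard margin-of-error formula for a sample proportion, z* times the square root of p(1 - p)/n, and assumes the worst case p = 0.5. The function names `z_star` and `margin_of_error` and the sample sizes 500 and 2,000 are chosen here purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_star(confidence: float) -> float:
    """Two-sided critical value z* for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def margin_of_error(n: int, confidence: float = 0.95, p_hat: float = 0.5) -> float:
    """Margin of error for a sample proportion: z* * sqrt(p_hat * (1 - p_hat) / n).

    p_hat = 0.5 is an assumed worst case, not a value taken from the text.
    """
    return z_star(confidence) * sqrt(p_hat * (1 - p_hat) / n)


# Higher confidence raises z* (wider interval); a larger n pulls it back down.
for confidence in (0.90, 0.95, 0.99):
    for n in (500, 2000):
        moe = margin_of_error(n, confidence)
        print(f"confidence={confidence:.0%}  n={n:>4}  margin of error = {moe:.1%}")
```

Under these assumptions, moving from 95% to 99% confidence at a fixed n widens the margin of error, while quadrupling n cuts it in half, so a 99% interval based on 2,000 people ends up narrower than a 95% interval based on 500.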
                                                    When your statistic is going to be a percentage (such as the percentage of
                                                    people who prefer to wear sandals during summer), a rough way to figure
                                                    margin of error for a 95% confidence interval is to take 1 divided by the square
                                                    root of n (the sample size). You can try different values of n and you can see
                                                    how the margin of error is affected. For example, a survey of 100 people from
                                                    a large population will have a margin of error of about 1/√100 = 0.10, or plus
                                                    or minus 10% (meaning the width of the confidence interval is 20%, which is
                                                    pretty large).
                                                    However, if you survey 1,000 people, your margin of error decreases
                                                    dramatically, to plus or minus about 3%; the width now becomes only 6%. A
                                                    survey of 2,500 people results in a margin of error of plus or minus 2% (so the
                                                    width is down to 4%). That’s quite a small sample for that much precision when
                                                    you think about how large the population is (the U.S. population, for example,
                                                    is over 310 million!).
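
To try different values of n yourself, the rule of thumb above is easy to compute. This is a small sketch of the rough 1 divided by the square root of n shortcut only; the function name `rough_margin_of_error` is made up for this example.

```python
from math import sqrt


def rough_margin_of_error(n: int) -> float:
    """Rough 95% margin of error for a percentage: 1 / sqrt(n)."""
    return 1 / sqrt(n)


# The three survey sizes mentioned in the text.
for n in (100, 1_000, 2_500):
    moe = rough_margin_of_error(n)
    print(f"n={n:>5}: margin of error ~ {moe:.1%}, interval width ~ {2 * moe:.1%}")
```

The results come out to about 10%, 3.2%, and 2%, matching the figures quoted above once the middle one is rounded to "about 3%."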

                                                    Keep in mind, however, you don’t want to go too high with your sample size,
                                                    because a point comes where you get diminishing returns. For example,
                                                    moving from a sample size of 2,500 to 5,000 narrows the width of the
                                                    confidence interval to about 2 × 1.4% = 2.8%, down from 4%. Each time you
                                                    survey one more person, the cost of your survey increases, so adding another
                                                    2,500 people to the survey just to narrow the interval by little more than 1
                                                    percentage point may not be worthwhile.
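
The diminishing return shows up clearly if you keep doubling the sample size. The sketch below reuses the rough 1 divided by the square root of n shortcut, so the interval width is roughly 2 divided by the square root of n; the helper name `rough_width` and the sample sizes beyond 5,000 are illustrative assumptions, not figures from the text.

```python
from math import sqrt


def rough_width(n: int) -> float:
    """Approximate width of a 95% interval for a percentage: 2 * (1 / sqrt(n))."""
    return 2 / sqrt(n)


# Double the sample size repeatedly and report how much width each doubling buys.
previous = None
for n in (2_500, 5_000, 10_000, 20_000):
    width = rough_width(n)
    if previous is None:
        print(f"n={n:>6}: width ~ {width:.1%}")
    else:
        print(f"n={n:>6}: width ~ {width:.1%} (narrower by {previous - width:.1%})")
    previous = width
```

Each doubling costs more people than the last but buys a smaller reduction in width, which is why adding respondents eventually stops being worth the money.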

                                                    The first step in any data analysis problem (and when critiquing another
                                                    person’s results) is to make sure you have good data. Statistical results are
                                                    only as good as the data that went into them, so real accuracy depends on the
                                                    quality of the data as well as on the sample size. A large sample that has
                                                    a great deal of bias (see Chapter 16) may produce a narrow confidence
                                                    interval, but that interval means nothing. That’s like competing in an archery
                                                    match and shooting your arrows consistently, only to find out that the whole
                                                    time you’ve been shooting at the next person’s target; that’s how far off you
                                                    are. In statistics, though, you can’t accurately measure bias; you can only try
                                                    to minimize it by designing good samples and studies (see Chapters 16 and 17).








