Page 216 - Statistics for Dummies
Part IV: Guesstimating and Hypothesizing with Confidence
When you need a high level of confidence, you have to increase the z*-value
and, hence, the margin of error, resulting in a wider confidence interval, which
isn’t good. (See the previous section.) But you can offset this wider confidence
interval by increasing the sample size and bringing the margin of error back
down, thus narrowing the confidence interval.
The increase in sample size allows you to still have the confidence level you
want, but also ensures that the width of your confidence interval will be small
(which is what you ultimately want). You can even determine the sample
size you need before you start a study: If you know the margin of error you
want to get, you can set your sample size accordingly. (See the later section
“Figuring Out What Sample Size You Need” for more.)
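The trade-off described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's own worked example: it assumes the standard large-sample formula for a proportion, margin of error = z* × sqrt(p(1 − p)/n), and the function names, the sample sizes of 500 and 1,000, and the worst-case guess p = 0.5 are all chosen here for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_star(confidence):
    """Two-sided critical value z* for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def margin_of_error(p_hat, n, confidence=0.95):
    """Margin of error for a sample proportion (assumed large-sample formula)."""
    return z_star(confidence) * sqrt(p_hat * (1 - p_hat) / n)


def required_sample_size(target_moe, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error is at most target_moe."""
    return ceil((z_star(confidence) / target_moe) ** 2 * p_guess * (1 - p_guess))


# p_hat = 0.5 is an illustrative worst case, not a figure from the text.
for confidence, n in [(0.95, 500), (0.99, 500), (0.99, 1000)]:
    moe = margin_of_error(0.5, n, confidence)
    print(f"confidence={confidence:.0%}  n={n:>5}  margin of error = +/-{moe:.1%}")

# Planning ahead: sample size for a +/-3% margin of error at 95% confidence.
print(required_sample_size(0.03))
```

Under these assumptions, moving from 95% to 99% confidence at n = 500 widens the margin of error, and doubling the sample to 1,000 brings it back below where it started.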
When your statistic is going to be a percentage (such as the percentage of
people who prefer to wear sandals during summer), a rough way to figure
margin of error for a 95% confidence interval is to take 1 divided by the square
root of n (the sample size). You can try different values of n and you can see
how the margin of error is affected. For example, a survey of 100 people from
a large population will have a margin of error of about 1/√100 = 0.10, or plus
or minus 10% (meaning the width of the confidence interval is 20%, which is
pretty large).
However, if you survey 1,000 people, your margin of error decreases dramatically,
to plus or minus about 3%; the width now becomes only 6%. A survey
of 2,500 people results in a margin of error of plus or minus 2% (so the width
is down to 4%). That’s a remarkably small sample for such a precise result, when
you think about how large the population is (the U.S. population, for example, is
over 310 million!).
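To try different values of n yourself, here is a small Python sketch of the rough rule. The function name is made up for illustration. The rule works because, at 95% confidence and a percentage near 50%, the usual formula 1.96 × sqrt(0.5 × 0.5/n) comes to about 0.98/√n, which is close to 1/√n.

```python
from math import sqrt


def rough_margin_of_error(n):
    """Rough 95% margin of error for a percentage: 1 divided by sqrt(n)."""
    return 1 / sqrt(n)


# The three sample sizes discussed in the text.
for n in (100, 1000, 2500):
    moe = rough_margin_of_error(n)
    print(f"n={n:>5}: margin of error +/-{moe:.1%}, interval width {2 * moe:.1%}")
```

Note that the population size never appears in the calculation; for a large population, only the sample size drives the rough margin of error.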
Keep in mind, however, you don’t want to go too high with your sample size,
because a point comes where you get diminishing returns. For example,
moving from a sample size of 2,500 to 5,000 narrows the width of the confidence
interval to about 2 × 1.4% = 2.8%, down from 4%. Each time you survey
one more person, the cost of your survey increases, so adding another 2,500
people to the survey just to narrow the interval by a little more than
1 percentage point may not be worthwhile.
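The diminishing return is easy to see if you compare how much width each jump in sample size buys. The sketch below uses the same rough 1/√n rule; the list of sample sizes (including 10,000, which the text doesn't mention) and the function name are picked here purely for illustration.

```python
from math import sqrt


def rough_width(n):
    """Approximate width of a 95% interval for a percentage: 2 * (1 / sqrt(n))."""
    return 2 / sqrt(n)


steps = [100, 1000, 2500, 5000, 10000]  # illustrative sample sizes
for smaller, larger in zip(steps, steps[1:]):
    gain = (rough_width(smaller) - rough_width(larger)) * 100  # percentage points
    added = larger - smaller
    print(f"{smaller:>5} -> {larger:>5}: width narrows by {gain:.2f} points "
          f"({gain / added * 1000:.2f} points per 1,000 people added)")
```

Each additional block of respondents costs about the same to survey but narrows the interval less than the block before it.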
The first step in any data analysis problem (and when critiquing another
person’s results) is to make sure you have good data. Statistical results are only
as good as the data that went into them, so real accuracy depends on the
quality of the data as well as on the sample size. A large sample with
a great deal of bias (see Chapter 16) may produce a narrow confidence
interval, but that interval means nothing. That’s like competing in an archery match and
shooting your arrows consistently, but finding out that the whole time you’re
shooting at the next person’s target; that’s how far off you are. With the field
of statistics, though, you can’t accurately measure bias; you can only try to
minimize it by designing good samples and studies (see Chapters 16 and 17).
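A made-up example shows why a narrow interval from biased data is no comfort. Every number below is hypothetical: a true population share of 50%, a biased survey of 10,000 people that reports 60%, and a well-designed survey of 400 people that reports 52%. The interval formula is the standard large-sample one for a proportion, assumed here rather than taken from this page.

```python
from math import sqrt


def interval_95(p_hat, n):
    """Approximate 95% confidence interval for a proportion (large-sample formula)."""
    moe = 1.96 * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - moe, p_hat + moe


TRUE_SHARE = 0.50  # hypothetical population value, unknown in a real study

surveys = [
    ("biased, n=10,000", 0.60, 10_000),  # hypothetical biased result
    ("unbiased, n=400", 0.52, 400),      # hypothetical well-designed result
]
for label, p_hat, n in surveys:
    low, high = interval_95(p_hat, n)
    covers = low <= TRUE_SHARE <= high
    print(f"{label}: ({low:.3f}, {high:.3f})  width {high - low:.3f}  "
          f"covers true value: {covers}")
```

In this sketch the big biased survey gives an interval about five times narrower than the small one, yet only the small survey's interval contains the true value. In a real study you wouldn't know the true value, which is why the bias has to be handled in the design.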