Page 77 - Intermediate Statistics for Dummies

Part I: Data Analysis and Model-Building Basics
For example, say the standard deviation of the house prices from a previous
study is s = $15,000, and you want to be 95 percent confident in your estimate
of average house price. Using a large sample size, your value of t (from the
last row of Table A-1 in the Appendix) would be 1.96. With a sample of 100
homes, your margin of error would be plus or minus 1.96 times $15,000
divided by the square root of 100, which comes out to $2,940. If this is too
large for you but you still want 95 percent confidence, crank up your value of
n. If you sample 500 homes, the margin of error decreases to plus or minus
1.96 times $15,000 divided by the square root of 500, which brings you down
to $1,314.81.
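The arithmetic in this example is easy to check with a few lines of Python. This is only a minimal sketch: the function name margin_of_error is made up for illustration, and 1.96 is the large-sample value quoted above.

```python
import math

def margin_of_error(t_value, std_dev, n):
    """Margin of error for a sample mean: t * s / sqrt(n)."""
    return t_value * std_dev / math.sqrt(n)

# Values from the house-price example in the text.
moe_100 = margin_of_error(1.96, 15_000, 100)
moe_500 = margin_of_error(1.96, 15_000, 500)

print(f"n = 100: ${moe_100:,.2f}")
print(f"n = 500: ${moe_500:,.2f}")
```

Running it reproduces the two figures above, $2,940 and $1,314.81. Note that going from 100 to 500 homes cuts the margin of error by a bit more than half, not by a factor of five, because n sits under a square root.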
You can actually use a formula to find the sample size you need to meet a
desired margin of error. That formula is $n = \left(\frac{t_{n-1}\, s}{\text{MOE}}\right)^2$, where MOE is the
desired margin of error (in the same units as s), s is the sample standard deviation,
and t is the value on the t-distribution that corresponds with the confidence
level you want. (You can use the last line of Table A-1 in the Appendix, which
will work fine, assuming that your sample size is well beyond 30.)
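Here is one way the sample-size formula might look in code. Treat it as a sketch under a few assumptions: the function name required_sample_size and the $1,000 target are hypothetical, and rounding up to the next whole home is a common convention rather than something stated above.

```python
import math

def required_sample_size(t_value, std_dev, desired_moe):
    """Sample size from n = (t * s / MOE)^2, rounded up to a whole number."""
    return math.ceil((t_value * std_dev / desired_moe) ** 2)

# Hypothetical target: estimate the average house price to within $1,000,
# using s = $15,000 and the large-sample value 1.96 from the text.
n_needed = required_sample_size(1.96, 15_000, 1_000)
print(n_needed)
```

The margin of error here is entered in dollars, the same units as the standard deviation. Rounding up rather than to the nearest whole number keeps the margin of error at or below the target.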
Interpreting a confidence interval
Interpreting a confidence interval involves a couple of subtle but important
issues, which I discuss in this section. The big idea is that a confidence interval
presents a range of likely values for the population parameter, based on
your sample. It gives a range rather than a single number because your sample
results are going to vary from sample to sample, and you want to address that.
A 95 percent confidence interval, for example, comes from a process that
produces a range of likely values for the parameter such that the parameter is
included in the interval 95 percent of the time in the long term.
A 95 percent confidence interval doesn’t mean that your particular confidence
interval has a 95 percent chance of capturing the actual value of the
parameter; after the sample has been taken, the parameter is either in the
interval or it isn’t. A confidence interval represents the long-term chances of
capturing the actual value of the population parameter over many different samples.
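A small simulation can make the "long term" idea concrete. The sketch below is an illustration only: the population mean of $200,000, the normal shape of the population, and the number of repeated samples are all made-up assumptions, not figures from this chapter.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MEAN = 200_000   # hypothetical population mean house price
TRUE_SD = 15_000      # hypothetical population standard deviation
N = 100               # homes per sample
TRIALS = 2_000        # number of repeated samples
T_VALUE = 1.96        # large-sample value for 95 percent confidence

hits = 0
for _ in range(TRIALS):
    # Draw one sample and build its 95 percent confidence interval.
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    mean = statistics.mean(sample)
    moe = T_VALUE * statistics.stdev(sample) / N ** 0.5
    # Each individual interval either contains the true mean or it doesn't.
    if mean - moe <= TRUE_MEAN <= mean + moe:
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.1%} of the intervals captured the true mean")
```

Any single interval in the loop either hits or misses; the 95 percent shows up only as the share of hits across all the samples, which should land close to 95 percent.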
Suppose a polling organization wants to estimate the percentage of people
in the United States who drive a car with more than 100,000 miles on it, and
it wants to be 95 percent confident in its results. The organization takes a
random sample of 1,200 people and finds that 420 of them (35 percent) drive
a much-driven car.
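For a rough sense of the numbers in this poll, the sketch below computes a 95 percent confidence interval for the percentage. It assumes the standard large-sample formula for a proportion (the sample proportion plus or minus 1.96 standard errors), which is not spelled out in this passage; the variable names are illustrative.

```python
import math

n = 1_200          # people sampled (from the text)
drivers = 420      # respondents whose car has more than 100,000 miles on it
z_value = 1.96     # large-sample value for 95 percent confidence

p_hat = drivers / n                                  # sample proportion
moe = z_value * math.sqrt(p_hat * (1 - p_hat) / n)   # assumed large-sample margin of error
lower, upper = p_hat - moe, p_hat + moe

print(f"{p_hat:.1%} plus or minus {moe:.1%} ({lower:.1%} to {upper:.1%})")
```

Under that assumption the margin of error works out to a little under 3 percentage points on either side of 35 percent.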
The meaty part of the interpretation lies in the confidence level — in this case,
the 95 percent. Because the organization took a sample of 1,200 people in the
U.S., asked each of them whether his or her car has more than 100,000 miles
on it and made a confidence interval out of it, the polling organization is, in
