Page 77 - Intermediate Statistics for Dummies

                                         Part I: Data Analysis and Model-Building Basics
                                                    For example, say the standard deviation of the house prices from a previous
                                                    study is s = $15,000, and you want to be 95 percent confident in your estimate
                                                    of average house price. Using a large sample size, your value of t (from the
                                                    last row of Table A-1 in the Appendix) would be 1.96. With a sample of 100
                                                    homes, your margin of error would be plus or minus 1.96 times $15,000
                                                    divided by the square root of 100, which comes out to $2,940. If this is too
                                                    large for you but you still want 95 percent confidence, crank up your value of
                                                    n. If you sample 500 homes, the margin of error decreases to plus or minus
                                                    1.96 times $15,000 divided by the square root of 500, which brings you down
                                                    to $1,314.81.
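
If you'd rather let a computer do the arithmetic, here is a minimal Python sketch of the two calculations above. The function name margin_of_error and its default t = 1.96 are illustrative choices, not something from the text; the inputs ($15,000, 100 homes, 500 homes) come straight from the example.

```python
import math

def margin_of_error(s, n, t=1.96):
    """Margin of error for a sample mean: t * s / sqrt(n).

    s is the standard deviation, n the sample size, and t the
    t-value for the chosen confidence level (1.96 is the large-sample
    value for 95 percent confidence).
    """
    return t * s / math.sqrt(n)

# House-price example: s = $15,000 at 95 percent confidence
moe_100 = margin_of_error(15_000, 100)
moe_500 = margin_of_error(15_000, 500)

print(f"n = 100: plus or minus ${moe_100:,.2f}")
print(f"n = 500: plus or minus ${moe_500:,.2f}")
```

Notice that multiplying the sample size by 5 cuts the margin of error only by a factor of the square root of 5 (about 2.24), because n sits under a square root.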
                                                    You can actually use a formula to find the sample size you need to meet a
                                                    desired margin of error. That formula is

                                                    \[ n = \left( \frac{t_{n-1}\, s}{\mathrm{MOE}} \right)^{2} \]

                                                    where MOE is the desired margin of error (in the same units as s), s is the
                                                    sample standard deviation, and t_{n-1} is the value on the t-distribution that
                                                    corresponds with the confidence level you want. (You can use the last line of
                                                    Table A-1 in the Appendix, which will work fine, assuming that your sample
                                                    size is well beyond 30.)
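
As a rough sketch of how the sample-size formula plays out, the Python snippet below plugs in the house-price standard deviation of $15,000 and a target margin of error of $1,000. The $1,000 target, the function name required_sample_size, and the choice to round up are assumptions made for illustration; rounding up simply guarantees that the margin of error is no larger than the target.

```python
import math

def required_sample_size(s, moe, t=1.96):
    """Smallest whole n satisfying n >= (t * s / moe) ** 2.

    s and moe must be in the same units (dollars here); t = 1.96 is
    the large-sample value for 95 percent confidence.
    """
    return math.ceil((t * s / moe) ** 2)

# Hypothetical target: estimate average house price to within $1,000
n_needed = required_sample_size(s=15_000, moe=1_000)
print(n_needed)  # (1.96 * 15,000 / 1,000)^2 = 864.36, rounded up to 865
```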
                                                    Interpreting a confidence interval
                                                    Interpreting a confidence interval involves a couple of subtle but important
                                                    issues, which I discuss in this section. The big idea is that a confidence
                                                    interval presents a range of likely values for the population parameter, based
                                                    on your sample. It gives a range rather than a single number because sample
                                                    results vary from one sample to the next, and the range accounts for that
                                                    variation. A 95 percent confidence interval, for example, provides a range of
                                                    likely values for the parameter such that, over many repeated samples, the
                                                    parameter is included in the interval 95 percent of the time in the long term.
                                                    A 95 percent confidence interval doesn’t mean that your particular
                                                    confidence interval has a 95 percent chance of capturing the actual value of the
                                                    parameter; after the sample has been taken, it’s either in the interval or it
                                                    isn’t. A confidence interval represents the long-term chances of capturing the
                                                    actual value of the population parameter over many different samples.
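
One way to see the "long-term" idea is to simulate it. The sketch below draws many samples from a made-up population, builds a 95 percent interval from each, and counts how often the interval captures the true mean. Everything about the population (a normal distribution with mean $200,000 and standard deviation $15,000), the number of trials, and the random seed is an assumption chosen purely for illustration.

```python
import math
import random
import statistics

def coverage_rate(true_mean=200_000, true_sd=15_000, n=100,
                  trials=2_000, t=1.96, seed=1):
    """Fraction of simulated 95 percent intervals that contain true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one sample of n values from the hypothetical population
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)
        moe = t * s / math.sqrt(n)
        # Each individual interval either captures the true mean or it doesn't
        if xbar - moe <= true_mean <= xbar + moe:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"Share of intervals capturing the true mean: {rate:.1%}")
```

The share should land near 95 percent, although any single interval in the loop is simply a hit or a miss. That is the distinction the paragraph above is drawing.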
                                                    Suppose a polling organization wants to estimate the percentage of people
                                                    in the United States who drive a car with more than 100,000 miles on it, and
                                                    it wants to be 95 percent confident in its results. The organization takes a
                                                    random sample of 1,200 people and finds that 420 of them (35 percent) drive
                                                    a much-driven car.
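
For reference, here is a sketch of how a 95 percent confidence interval for this poll could be computed in Python. It assumes the usual large-sample (normal approximation) interval for a proportion, with 1.96 as the multiplier; the function name proportion_ci is an illustrative choice, and only the counts (420 out of 1,200) come from the example.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Large-sample confidence interval for a population proportion.

    Returns (p_hat, lower, upper), where the margin of error is
    z * sqrt(p_hat * (1 - p_hat) / n).
    """
    p_hat = successes / n
    moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - moe, p_hat + moe

# Poll example: 420 of 1,200 people drive a car with more than 100,000 miles
p_hat, low, high = proportion_ci(420, 1200)
print(f"Estimate: {p_hat:.1%}, interval: {low:.1%} to {high:.1%}")
```

Under these assumptions the interval runs from roughly 32.3 percent to 37.7 percent, a margin of error of about 2.7 percentage points.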
                                                    The meaty part of the interpretation lies in the confidence level — in this case,
                                                    the 95 percent. Because the organization took a sample of 1,200 people in the
                                                    U.S., asked each of them whether his or her car has more than 100,000 miles
                                                    on it, and made a confidence interval out of it, the polling organization is, in