Page 42 - Intermediate Statistics for Dummies

                                             Chapter 1: Beyond Number Crunching: The Art and Science of Data Analysis
                                                    Hypothesis test
                                                    A hypothesis test is a statistical procedure that you use to test an existing
                                                    claim about the population, using your data. The claim is denoted by Ho (the
                                                    null hypothesis). If your data are consistent with the claim, you fail to
                                                    reject Ho. If your data don’t support the claim, you reject Ho and conclude
                                                    an alternative hypothesis, Ha. Most people conduct a hypothesis test not
                                                    merely to show that their data support an existing claim, but rather to show
                                                    that the existing claim is false, in favor of the alternative hypothesis.
                                                    The Pew Research Center studied the percentage of people who go to ESPN
                                                    for their sports news. Their statistics, based on a survey of about 1,000
                                                    people, found that in 2000, 23 percent of people said they go to ESPN, while in
                                                    2004, only 20 percent reported going to ESPN. The question is this: Does this
                                                    3-percent reduction in viewers from 2000 to 2004 represent a significant trend
                                                    that ESPN should worry about?
                                                    To test these differences formally, you can set up a hypothesis test. You set
                                                    up your null hypothesis as the result you have to believe in the absence of
                                                    your study. Ho: No difference exists between 2000 and 2004 data for ESPN
                                                    viewership. Your alternative hypothesis (Ha) is that a difference does exist.
                                                    In very general terms, here’s what’s happening with a hypothesis test. You
                                                    have the sample data, and you find the statistics that are relevant. In this
                                                    case, you have two sample percentages, one for 2000 and one for 2004. You
                                                    take the difference between the two sample percentages (3 percent) and divide
                                                    it by the standard error for the difference. The standard error measures how
                                                    much the difference in the statistics is expected to change from sample to
                                                    sample. In this case, the standard error comes to about 1.8 percent (for
                                                    specific calculations, see Chapter 3).
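The standard error quoted above can be reproduced with a short sketch. It assumes the common unpooled formula for the difference between two sample proportions and a sample size of exactly 1,000 for each survey; the text says only "about 1,000 people" and defers the exact calculation to Chapter 3, so both the formula choice and the function name `standard_error_diff` are illustrative assumptions.

```python
import math

def standard_error_diff(p1, n1, p2, n2):
    """Unpooled standard error of the difference between two sample proportions.

    Assumed formula (not quoted from the book): sqrt(p1(1-p1)/n1 + p2(1-p2)/n2).
    """
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Percentages from the ESPN example, written as proportions.
p_2000, p_2004 = 0.23, 0.20
# Assumption: each survey had exactly 1,000 respondents.
n_2000 = n_2004 = 1000

se = standard_error_diff(p_2000, n_2000, p_2004, n_2004)
print(f"standard error = {se:.3f}")  # about 0.018, i.e. 1.8 percent
```

Under these assumptions the result rounds to 0.018, which matches the 1.8 percent figure in the text.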
                                                    Taking the difference in the statistics (3 percent = 0.03) divided by the
                                                    standard error (1.8 percent = 0.018) gives you the value of 1.67 (called the
                                                    test statistic). This value represents the difference between the two
                                                    statistics, in terms of number of standard errors. This result has a universal
                                                    interpretation. Roughly speaking, when no real difference exists, about 95
                                                    percent of test statistics fall between –2.00 and +2.00 just by chance. So if
                                                    your test statistic falls in that range, the results you found don’t differ
                                                    enough to get excited about. (And this example falls right into that
                                                    situation.) After you take the variability of the sample results into account,
                                                    the difference in these particular samples doesn’t transfer over to the
                                                    populations they represent. So, because you can’t reject Ho, you don’t have
                                                    enough evidence to say that the percentage of viewers of ESPN in the entire
                                                    population changed from 2000 to 2004.
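The arithmetic and the rough decision rule described above can be sketched in a few lines. The function names `z_statistic` and `decide` are made up for illustration, and the cutoff of 2.00 is the rounded rule of thumb from the text rather than an exact critical value.

```python
def z_statistic(difference, standard_error):
    """Express the observed difference in units of standard errors."""
    return difference / standard_error

def decide(z, cutoff=2.0):
    """Apply the rough rule from the text: values within +/- cutoff are unremarkable."""
    return "reject Ho" if abs(z) > cutoff else "fail to reject Ho"

# Values from the ESPN example: a 3 percent difference, 1.8 percent standard error.
z = z_statistic(0.03, 0.018)
print(round(z, 2), "->", decide(z))  # prints: 1.67 -> fail to reject Ho
```

Because 1.67 lies between –2.00 and +2.00, the sketch reaches the same conclusion as the text: the data don't give enough evidence to reject Ho.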
                                                    Because you have a 95 percent confidence level, this test uses a significance
                                                    level (α level) of 1 – 0.95 = 0.05, or 5 percent. This percentage is the chance
                                                    of rejecting Ho when Ho is actually true. It sets the cutoff for how unusual
                                                    your results must be before you stop attributing them to chance.
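To see how the significance level connects to the ±2.00 rule of thumb, the following sketch converts a confidence level into a cutoff for the test statistic. It assumes a two-sided test and a test statistic that follows the standard normal distribution; neither assumption is spelled out on this page, and the variable names are illustrative.

```python
from statistics import NormalDist

confidence_level = 0.95
alpha = 1 - confidence_level  # significance level

# Two-sided test (an assumption): split alpha evenly between the two tails
# of the standard normal distribution and find the matching cutoff.
critical_value = NormalDist().inv_cdf(1 - alpha / 2)

print(f"alpha = {alpha:.2f}, cutoff = +/-{critical_value:.2f}")
# prints: alpha = 0.05, cutoff = +/-1.96
```

The exact cutoff of 1.96 is what the text rounds to 2.00.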