Ho. Now, imagine that their running back makes a touchdown by pushing the ball just barely over the goal line, so close that his team needs to have a referee review the film before calling it a touchdown. This situation is equivalent to rejecting Ho with a p-value just below your prespecified value of α = 0.05. In this case, the p-value is close to the borderline, say 0.045. But if their team makes a touchdown by catching a pass deep in the end zone, no one has any doubt about the result because the ball was obviously past the goal line, which is equivalent to the p-value being very small, say something like 0.001. The opposing team is showing a lot of evidence against Ho (and your team could be in a lot of trouble).
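
To put the referee's call in code form, here's a minimal sketch (mine, not from the book) of the decision rule at work, assuming the usual cutoff of α = 0.05; the two p-values are the ones from the example, and the helper function decision is just a made-up name for illustration.

# A minimal sketch of the reject/don't-reject decision rule, assuming
# alpha = 0.05; the p-values below come from the football example.
def decision(p_value, alpha=0.05):
    """Reject Ho whenever the p-value falls below alpha."""
    return "reject Ho" if p_value < alpha else "fail to reject Ho"

print(decision(0.045))  # reject Ho, but only by a nose (the replay review)
print(decision(0.001))  # reject Ho with no doubt (the deep end-zone catch)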

Deconstructing Type I and Type II errors

Any technique you use in statistics to make a conclusion about a population based on a sample of data has the chance of making an error. The errors I am talking about, Type I and Type II errors, are due to random chance.

For example, you could flip a fair coin ten times and get all heads, making you think that the coin isn't fair at all. This thinking would result in an error, because the coin actually was fair, but the data just wasn't confirming that due to chance. On the other hand, another coin may be unfair, and, just by chance, you flip it ten times and get exactly five heads, which makes you think that particular coin is equally balanced and doesn't present any problem. (This tells you strange things can happen, especially when the sample size is small.)
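
If you want to see just how likely those coincidences are, here's a quick sketch (not from the book) that works them out with the binomial formula; the 0.7 heads probability for the "unfair" coin is simply an assumed value for illustration.

# Probability of exactly k heads in n flips when P(heads) = p,
# computed straight from the binomial formula.
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(binom_pmf(10, 10, 0.5))  # fair coin, ten heads in a row: about 0.001
print(binom_pmf(5, 10, 0.7))   # unfair coin, exactly five heads: about 0.10

So ten heads from a fair coin happens only about once in a thousand tries, yet it can happen; and an unfair coin still lands on exactly five heads about one time in ten.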

The way you set up your test can help to reduce these kinds of errors, but they are always out there. As a data analyst, you need to know how to measure and understand the impact of the errors that can occur with a hypothesis test and what you can do to possibly make those errors smaller. In the following sections, I show you how you can do just that.

Making false alarms with Type I errors

A Type I error represents the situation where the coin was actually fair (using the example from the preceding section), but your data led you to conclude that it wasn't, just by chance. I think of a Type I error as a false alarm: You blew the whistle when you shouldn't have.

To include a definition that makes all those stat experts happy, a Type I error is the conditional probability of rejecting Ho, given that Ho is true.
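
To see that conditional probability in action, here's a small simulation sketch (my own, not from the book): it assumes Ho is true by drawing every sample from a population whose mean really is 50, runs a t-test on each sample, and counts how often the test rejects Ho at α = 0.05 anyway. The sample size, population values, and seed are all assumed for illustration.

# A simulation in which Ho (population mean = 50) is true, so every
# rejection is a false alarm; the long-run false-alarm rate is about alpha.
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(seed=1)
alpha, n, repetitions = 0.05, 30, 10_000
false_alarms = 0

for _ in range(repetitions):
    sample = rng.normal(loc=50, scale=10, size=n)   # Ho is true here
    if ttest_1samp(sample, popmean=50).pvalue < alpha:
        false_alarms += 1

print(false_alarms / repetitions)  # close to 0.05, which is alpha

The simulated rate bounces around a bit from run to run, but with enough repetitions it settles right at α, which is exactly what the definition says it should do.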

The chance of making a Type I error is equal to α, which is predetermined before you begin collecting your data. This α is the same α that represents the chance of missing the boat in a confidence interval. It makes some sense