Page 81 - Intermediate Statistics for Dummies
Part I: Data Analysis and Model-Building Basics
Ho. Now, imagine that their running back makes a touchdown by pushing the
ball just barely over the goal line, so close that his team needs to have a
referee review the film before calling it a touchdown. This situation is equivalent
to rejecting Ho with a p-value just below your prespecified value of α = 0.05.
In this case, the p-value is close to the borderline, say 0.045. But, if their team
makes a touchdown by catching a pass deep in the end zone, no one has
any doubt about the result because the ball was obviously past the goal line,
which is equivalent to the p-value being very small, say something like 0.001.
The opposing team’s showing a lot of evidence against Ho (and your team
could be in a lot of trouble).
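The decision rule behind the analogy can be sketched in Python. (This is an illustrative sketch, not code from the book; the p-values and the cutoff α = 0.05 are the ones used in the example above.)

```python
# Compare a p-value to a prespecified alpha level.
ALPHA = 0.05

def decide(p_value, alpha=ALPHA):
    """Reject Ho when the p-value falls below alpha."""
    return "reject Ho" if p_value < alpha else "fail to reject Ho"

print(decide(0.045))  # just under alpha: a borderline rejection
print(decide(0.001))  # far below alpha: strong evidence against Ho
print(decide(0.20))   # well above alpha: no evidence against Ho
```

Whether the p-value is 0.045 or 0.001, the formal decision is the same; the size of the p-value only tells you how comfortable you should feel about it.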
Deconstructing Type I and Type II errors
Any technique you use in statistics to make a conclusion about a population
based on a sample of data has the chance of making an error. The errors I am
talking about, Type I and Type II errors, are due to random chance.
For example, you could flip a fair coin ten times and get all heads, making you
think that the coin isn’t fair at all. This thinking would result in an error,
because the coin actually was fair, but the data just wasn’t confirming that
due to chance. On the other hand, another coin may be unfair, and, just by
chance, you flip it ten times and get exactly five heads, which makes you
think that particular coin is equally balanced and doesn’t present any
problem. (This tells you strange things can happen, especially when the
sample size is small.)
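You can put numbers on how likely those two misleading outcomes are with a quick sketch (illustrative Python, not from the book), using the binomial probability formula with a fair coin (p = 0.5) and ten flips:

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of exactly k heads in n flips."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5
print(binom_pmf(10, n, p))  # all ten heads from a fair coin: 1/1024, about 0.001
print(binom_pmf(5, n, p))   # exactly five heads from any coin with p = 0.5: about 0.246
```

So a fair coin shows ten straight heads about once in a thousand tries, rarely, but not never, which is exactly how these chance errors sneak in.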
The way you set up your test can help to reduce these kinds of errors, but
they are always out there. As a data analyst, you need to know how to
measure and understand the impact of the errors that can occur with a
hypothesis test and what you can do to possibly make those errors smaller. In the
following sections, I show you how you can do just that.
Making false alarms with Type I errors
A Type I error represents the situation where the coin was actually fair (using
the example from the preceding section), but your data led you to conclude
that it wasn’t, just by chance. I think of a Type I error as a false alarm: You
blew the whistle when you shouldn’t have.
To include a definition that makes all those stat experts happy, a Type I error
is rejecting Ho, given that Ho is actually true.
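You can see this definition in action by simulation (a hedged sketch, not from the book): flip a truly fair coin ten times over and over, and count how often an extreme result makes you cry foul anyway. The rejection rule used here, reject when you see 0, 1, 9, or 10 heads, is an illustrative choice, not one the book prescribes.

```python
import random

random.seed(42)  # fixed seed so the run is repeatable
trials = 100_000
rejections = 0
for _ in range(trials):
    # Ho is true by construction: the simulated coin really is fair.
    heads = sum(random.random() < 0.5 for _ in range(10))
    if heads <= 1 or heads >= 9:  # an "extreme" result triggers a false alarm
        rejections += 1

# The fraction of rejections estimates the Type I error rate of this rule;
# its true value is (1 + 10 + 10 + 1)/1024, roughly 0.021, which stays under
# the nominal alpha = 0.05.
print(rejections / trials)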
The chance of making a Type I error is equal to α, which is predetermined
before you begin collecting your data. This α is the same α that represents
the chance of missing the boat in a confidence interval. It makes some sense