Page 39 - Intermediate Statistics for Dummies
Part I: Data Analysis and Model-Building Basics
Breaking the rules
According to the rules that all good statisticians live by, Ellen’s story should
end there. She may still be convinced that sugar indeed helps roses last
longer. She may use sugar with her roses for the rest of time and tell her
friends to use it too. But she isn’t allowed to say that sugar water gives
statistically different results than water alone; her analysis failed to show that.
But remember, Ellen’s last name is Go-getter, so she’s out to get those results.
She knows that nonparametric tests usually give more conservative results
than regular tests, and despite the fact that the conditions aren’t met, she
decides to analyze her data again, this time using the two-sample t-test.
Putting her data into a two-sample t-test takes only two more clicks of the
mouse, and Ellen’s results give her a p-value of 0.043. At the usual significance
level for hypothesis tests, 0.050, her p-value falls below the cutoff, so she can
reject Ho. (In a two-sample t-test, Ho is that there’s no difference in the means
of the two groups. Her Ha in this case is that the mean of the sugar group is
larger than the mean of the control group.)
So Ellen gleefully cheers herself on for getting the results she wanted and
decides there’s no harm in trying a different analysis when all else fails.
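The mechanics of the test Ellen ran can be sketched in a few lines of Python. This is only a minimal sketch, not the software Ellen used: the vase-life numbers are made up for illustration, the function names are hypothetical, and it assumes the pooled-variance version of the two-sample t-test (the text doesn't say which version her software ran). Notice that the code will produce a p-value whether or not the conditions for the t-test are met; checking those conditions is still up to you.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t), found by Simpson's rule on the density between 0 and |t|.

    steps must be even for Simpson's rule.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3      # P(0 < T < |t|)
    upper = 0.5 - area        # P(T > |t|), using symmetry around 0
    return upper if t >= 0 else 1 - upper


def pooled_t_test_greater(treatment, control):
    """One-sided two-sample t-test (assumed pooled-variance version).

    Ho: the two group means are equal.
    Ha: the treatment mean is larger than the control mean.
    Returns the t statistic and the one-sided p-value.
    """
    n1, n2 = len(treatment), len(control)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(treatment) - mean(control)) / std_error
    return t, t_upper_tail(t, df)


def reject_null(p_value, alpha=0.05):
    """Decision rule from the text: reject Ho when the p-value is below alpha."""
    return p_value < alpha


# Hypothetical vase life in days; these are NOT Ellen's data.
sugar_days = [9, 11, 10, 12, 10, 11]
water_days = [8, 10, 9, 9, 11, 8]

t_stat, p_value = pooled_t_test_greater(sugar_days, water_days)
print(f"t = {t_stat:.3f}, one-sided p-value = {p_value:.3f}")
print("Reject Ho at the 0.05 level?", reject_null(p_value))
```

With Ellen's reported p-value, reject_null(0.043) returns True, which is exactly why she cheers. The arithmetic is the easy part; whether the answer means anything depends on the conditions she skipped.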
Seeing the error of Ellen’s ways
But again, “Houston. . .” — you know the rest. Ellen’s problem is that she
cheated her way to getting a result that’s incorrect. She knew that the conditions
for the two-sample t-test weren’t met, but when the correct analysis
failed to get the results she wanted, she found an analysis that did. The trouble
is, the results of the two-sample t-test are bogus.
Now it may not be a life-and-death situation whether your roses actually do
last a little bit longer on sugar or not. (Incidentally, the gardening crowd says
they don’t, and that sugar in fact can encourage the growth of stem-clogging
bacteria so the flower can’t take in water.) But imagine a situation where doctors
are trying to test to see whether a certain medication helps people get
over an illness faster or whether some procedure helps cancer patients live
longer. Now you’re talking about results with a very serious impact.
Using the wrong data analysis for the sake of getting the results you desire
leads to two major problems:
- You mislead your audience into thinking that your hypothesis is actually
  correct, which it may not be.
- Sooner or later someone is going to try to replicate those results and
  will find out that they can’t be replicated. This discovery will cost you
  your credibility, big time. And unfortunately, you mislead many
  people in the meantime.