Page 39 - Intermediate Statistics for Dummies
Part I: Data Analysis and Model-Building Basics

Breaking the rules

According to the rules that all good statisticians live by, Ellen’s story
should end there. She may still be convinced that sugar indeed helps roses
last longer. She may use sugar with her roses for the rest of time and tell
her friends to use it too. But she isn’t allowed to say that sugar water gives
statistically different results than water alone; her analysis failed to show
that.

But remember, Ellen’s last name is Go-getter, so she’s out to get those
results. She knows that nonparametric tests usually give more conservative
results than regular tests, so even though the conditions for it aren’t met,
she decides to analyze her data again, this time using the two-sample t-test.

Putting her data into a two-sample t-test takes only two more clicks of the
mouse, and Ellen’s results give her a p-value of 0.043. That p-value is less
than 0.050, the usual significance level for hypothesis tests, so she can
reject Ho. (In a two-sample t-test, Ho is that there’s no difference in the
means of the two groups, and her Ha in this case is that the mean of the sugar
group is larger than the mean of the control group.)

So Ellen gleefully cheers herself on for getting the results she wanted and
decides there’s no harm in trying a different analysis when all else fails.
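
To see what Ellen's "two more clicks" amount to, here is a minimal sketch in
Python that runs both kinds of test on the same data. Everything in it is an
assumption made for illustration: the vase-life numbers are made up (Ellen's
actual data aren't shown here), the rank sum test stands in for her
nonparametric test, which this section doesn't name, and the function names
are hypothetical. It sticks to the standard library; SciPy users would more
likely reach for scipy.stats.ttest_ind and scipy.stats.mannwhitneyu.

```python
import math
from itertools import combinations
from statistics import mean, variance

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t, using Simpson's rule on the density."""
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    const /= math.sqrt(df * math.pi)
    pdf = lambda x: const * (1 + x * x / df) ** (-(df + 1) / 2)
    h = abs(t) / steps
    total = pdf(0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    upper = 0.5 - total * h / 3          # 0.5 minus P(0 < T < |t|)
    return upper if t >= 0 else 1 - upper

def pooled_t_test(a, b):
    """One-sided pooled two-sample t-test; Ha: mean(a) > mean(b)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, t_upper_tail(t, df)

def rank_sum_test(a, b):
    """One-sided exact rank sum test; Ha: values in a tend to be larger."""
    pooled = list(a) + list(b)
    # Midranks: tied values share the average of the ranks they occupy.
    ranks = [sum(v < x for v in pooled) + (sum(v == x for v in pooled) + 1) / 2
             for x in pooled]
    observed = sum(ranks[:len(a)])
    # Rank sum of "group a" under every possible relabeling of the roses.
    sums = [sum(ranks[i] for i in idx)
            for idx in combinations(range(len(pooled)), len(a))]
    p = sum(s >= observed - 1e-9 for s in sums) / len(sums)
    return observed, p

# Hypothetical vase life in days; NOT Ellen's data.
sugar = [7, 8, 6, 9, 14, 7]
water = [6, 5, 7, 6, 8, 5]

t_stat, p_t = pooled_t_test(sugar, water)
w_stat, p_rank = rank_sum_test(sugar, water)
print(f"two-sample t-test: t = {t_stat:.3f}, one-sided p = {p_t:.3f}")
print(f"rank sum test:     W = {w_stat:.1f}, one-sided p = {p_rank:.3f}")
```

The code happily produces a p-value for either test. Nothing in the arithmetic
checks whether the conditions for the t-test hold; that part is up to you.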

Seeing the error of Ellen’s ways

But again, “Houston . . .” and you know the rest. Ellen’s problem is that she
cheated her way to getting a result that’s incorrect. She knew that the
conditions for the two-sample t-test weren’t met, but when the correct
analysis failed to get the results she wanted, she found an analysis that did.
The trouble is, the results of the two-sample t-test are bogus.

Now it may not be a life-and-death matter whether your roses actually last a
little longer on sugar. (Incidentally, the gardening crowd says they don’t,
and that sugar in fact can encourage the growth of stem-clogging bacteria so
the flower can’t take in water.) But imagine a situation where doctors are
testing whether a certain medication helps people get over an illness faster
or whether some procedure helps cancer patients live longer. Now you’re
talking about results with a very serious impact.

Using the wrong data analysis just to get the results you want causes two
major problems:

  ✓ You mislead your audience into thinking that your hypothesis is actually
    correct, which it may not be.
  ✓ Sooner or later someone is going to try to replicate those results and
    will find out that they can’t be replicated. This discovery will cost you
    your credibility big time. And unfortunately, you will have misled many
    people in the meantime.
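
A small simulation can make the second problem concrete. The sketch below is
an illustration under stated assumptions, not Ellen's analysis: it draws two
groups from the same skewed distribution (so Ho is true by construction), runs
a t-test and a rank sum test on each pair, and counts how often a researcher
who accepts whichever test comes out significant would claim an effect. The
group size, the distribution, the hard-coded critical value, and the function
names are all hypothetical choices.

```python
import math
import random
from statistics import NormalDist, mean, variance

N = 10                                # roses per group (hypothetical)
T_CRIT = 1.734                        # upper 5% point of Student's t, 18 df
Z_CRIT = NormalDist().inv_cdf(0.95)   # upper 5% point of the standard normal

def t_rejects(a, b):
    """One-sided pooled t-test at the 5% level; Ha: mean(a) > mean(b)."""
    pooled_var = (variance(a) + variance(b)) / 2   # valid for equal group sizes
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * 2 / N)
    return t > T_CRIT

def rank_sum_rejects(a, b):
    """One-sided rank sum test at the 5% level, normal approximation."""
    u = sum(x > y for x in a for y in b)           # pairs where a beats b
    z = (u - N * N / 2) / math.sqrt(N * N * (2 * N + 1) / 12)
    return z > Z_CRIT

def simulate(trials=2000, seed=1):
    rng = random.Random(seed)
    hits = {"t": 0, "rank": 0, "either": 0}
    for _ in range(trials):
        # Both groups come from the same skewed distribution: no real effect.
        a = [rng.expovariate(1 / 7) for _ in range(N)]
        b = [rng.expovariate(1 / 7) for _ in range(N)]
        t_hit, rank_hit = t_rejects(a, b), rank_sum_rejects(a, b)
        hits["t"] += t_hit
        hits["rank"] += rank_hit
        hits["either"] += t_hit or rank_hit
    return {name: count / trials for name, count in hits.items()}

rates = simulate()
for name, rate in rates.items():
    print(f"{name:>6}: rejected a true Ho in {rate:.1%} of trials")
```

The "either" rate can never be smaller than the rate for a single test, so
shopping between analyses can only raise the chance of announcing an effect
that isn't there. Those are exactly the results that fail to replicate.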