Page 345 - Statistics for Dummies

Chapter 20: Ten Tips for the Statistically Savvy Sleuth
                                                    Statistics are based on formulas that take the numbers you give them and
                                                    crunch out what you ask them to crunch out. The formulas don’t know whether
                                                    the final answers are correct or not. The people behind the formulas should
                                                    know better, of course. Those who don’t know better will make mistakes; those
                                                    who do know better might fudge the numbers anyway and hope you don’t catch
                                                    on. You, as a consumer of information (also known as a certified skeptic), must
                                                    be the one to take action. The best policy is to ask questions.
                                         Report Selective Reporting
                                                    You cannot credit studies in which a researcher reports his one statistically
                                                    significant result but fails to mention the results of his other 25 analyses,
                                                    none of which came up significant. If you had known about all the other
                                                    analyses, you might have wondered whether this one statistically significant
                                                    result is truly meaningful or simply due to chance (like the idea that a monkey
                                                    typing randomly on a typewriter would eventually write Shakespeare). It’s
                                                    a legitimate question.
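To put a rough number on that question, here is a minimal sketch. It assumes (the text says neither) that the 26 analyses are independent of each other and that each is tested at the 0.05 level; the function name is made up for illustration. Under those assumptions, the chance that at least one analysis comes up significant by luck alone is 1 - (1 - alpha)^m, where m is the number of tests.

```python
def chance_of_false_alarm(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests comes up
    significant at level alpha when no real effect exists anywhere."""
    return 1 - (1 - alpha) ** num_tests


# 1 reported result + 25 unreported analyses = 26 tests.
# The 0.05 level and the independence of the tests are assumptions.
print(round(chance_of_false_alarm(26), 2))
```

Under these assumptions the answer is about 0.74: roughly a three-in-four chance of at least one "significant" fluke even if nothing real is going on.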
                                                    The misleading practice of analyzing data until you find something is what
                                                    statisticians call data snooping or data fishing. Here’s an example: Suppose
                                                    Researcher Bob wants to figure out what causes first graders to argue with
                                                    each other so much in school (he must not be a parent or he wouldn’t even
                                                    try to touch this one!). He sets up a study in which he observes a classroom
                                                    of first graders every day for a month and records their every move. He gets
                                                    back to his office, enters all his data, hits a button that asks the computer
                                                    to perform every analysis known to man, and sits back in his chair eagerly
                                                    awaiting the results. After all, with all this data he’s bound to find something.
                                                    After poring through his results for several days, he hits pay dirt. He runs
                                                    out of his office and tells his boss he’s got to put out a press release saying a
                                                    ground-breaking study finds that first graders argue most 1) when the day of
                                                    the week ends in the letter y or 2) when the goldfish in their classroom
                                                    aquarium swims through the hole in its sunken pirate ship. Great job, Researcher
                                                    Bob! I’ve got a feeling that a month of watching a group of first graders took
                                                    the edge off his data analysis skills.
                                                    The bottom line is that if you collect enough data and analyze it long enough,
                                                    you’re bound to find something, but that something may be totally meaningless
                                                    or just a fluke that’s not repeatable by other researchers.
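A small simulation can illustrate the point. This is only a sketch under stated assumptions: every "analysis" compares two groups of 30 made-up measurements drawn from the same distribution, so there is truly nothing to find, and each comparison uses a simple two-sided z-test at the 0.05 level. The group size, the test, the seed, and the function names are all hypothetical choices, not anything from Researcher Bob's study.

```python
import math
import random


def p_value_for_noise(rng, group_size=30):
    """Two-sided z-test p-value comparing two groups drawn from the SAME
    standard normal distribution, so any 'difference' is pure chance."""
    a = [rng.gauss(0, 1) for _ in range(group_size)]
    b = [rng.gauss(0, 1) for _ in range(group_size)]
    diff = sum(a) / group_size - sum(b) / group_size
    z = diff / math.sqrt(2 / group_size)  # each observation has variance 1
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value


def count_flukes(num_tests, alpha=0.05, seed=0):
    """Run num_tests analyses on meaningless data; count the 'significant' ones."""
    rng = random.Random(seed)
    return sum(p_value_for_noise(rng) < alpha for _ in range(num_tests))


print(count_flukes(1000), "of 1000 analyses on pure noise look significant")
```

Because every comparison here is noise, each "significant" result the script counts is a fluke. Over many runs the count should hover around 5 percent of the tests, which is exactly what a 0.05 significance level allows.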
                                                    How do you protect yourself against misleading results due to data fishing?
                                                    Find out more details about the study, starting with how many tests were
                                                    done in total, and how many of those tests were found to be non-significant.
                                                    In other words, get the whole story if you can, so that you can put the
                                                    significant results into perspective.
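Once you know the total number of tests, a little arithmetic helps with the perspective. The sketch below assumes a 0.05 significance level and borrows the 26 analyses from the earlier example. It compares the number of significant results reported with the number expected from chance alone, and it also shows the stricter per-test cutoff that a Bonferroni correction would demand. The Bonferroni correction is one common adjustment for multiple tests; it is not something this chapter describes, and the function name is hypothetical.

```python
def put_in_perspective(total_tests, num_significant, alpha=0.05):
    """Compare reported significant results with what chance alone predicts."""
    return {
        # False positives expected if nothing real is going on
        "expected_by_chance": total_tests * alpha,
        "reported_significant": num_significant,
        # Stricter per-test p-value threshold under a Bonferroni correction
        "bonferroni_cutoff": alpha / total_tests,
    }


summary = put_in_perspective(total_tests=26, num_significant=1)
print(summary)
```

With 26 tests at the 0.05 level, chance alone predicts about 1.3 significant results, so a single significant finding is no more than you would expect from luck.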







