Page 65 - Statistics for Dummies

Chapter 4: Tools of the Trade
Some of the biggest culprits of statistical misrepresentation caused by bad
sampling are surveys done on the Internet. You can find thousands of Internet
surveys that work by having people log on to a particular Web site and give
their opinions. But even if 50,000 people in the U.S. complete such a survey,
the results don’t represent the population of all Americans. They represent
only those folks who have Internet access, who logged on to that particular
Web site, and who were interested enough to participate in the survey (which
typically means that they have strong opinions about the topic in question).
The result of all these problems is bias: systematic favoritism of certain
individuals or certain outcomes of the study.

How do you select a sample in a way that avoids bias? The key word is
random. A random sample is a sample selected by equal opportunity; that is,
every possible sample of the same size as yours has an equal chance of being
selected from the population. What random really means is that no group in
the population is favored in or excluded from the selection process.
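
The "equal opportunity" idea can be sketched in a few lines of Python. This is only an illustration: the numbered list of 1,000 households and the function name `simple_random_sample` are made up for the example, and the standard library's `random.sample` stands in for the randomizing mechanism.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)

# Hypothetical population: 1,000 numbered households.
population = list(range(1, 1001))

sample = simple_random_sample(population, 50, seed=1)
print(sorted(sample))
```

Because the draw is made without replacement, no household can appear twice, and no household is favored or excluded ahead of time.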

Non-random (in other words, bad) samples are samples selected in such a way
that some type of favoritism or automatic exclusion of a part of the
population was involved. A classic example of a non-random sample comes from
polls for which the media asks you to phone in your opinion on a certain
issue (“call-in” polls). People who choose to participate in call-in polls do
not represent the population at large because they had to be watching that
program, and they had to feel strongly enough to call in. They technically
don’t represent a sample at all, in the statistical sense of the word,
because no one selected them beforehand; they selected themselves to
participate, creating a volunteer or self-selected sample. The results will
be skewed toward people with strong opinions.
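
A small simulation can make the skew visible. Everything in it is an assumption chosen for illustration: opinions are spread evenly on a scale from -1 (strongly against) to +1 (strongly for), and the chance that someone calls in is assumed to grow with the strength of their opinion. The names `simulate_polls` and `mean_strength` are hypothetical.

```python
import random

def mean_strength(opinions):
    """Average strength of opinion, ignoring direction."""
    return sum(abs(x) for x in opinions) / len(opinions)

def simulate_polls(pop_size=100_000, sample_size=1_000, seed=0):
    rng = random.Random(seed)
    # Assumed population: opinions spread evenly between -1 and +1.
    opinions = [rng.uniform(-1.0, 1.0) for _ in range(pop_size)]
    # Random sample: every person has the same chance of being picked.
    random_sample = rng.sample(opinions, sample_size)
    # Call-in poll: people select themselves, and (by assumption) the
    # chance of calling rises steeply with how strongly they feel.
    call_in = [x for x in opinions if rng.random() < 0.05 * abs(x) ** 3]
    return {
        "population": mean_strength(opinions),
        "random_sample": mean_strength(random_sample),
        "call_in": mean_strength(call_in),
    }

results = simulate_polls()
for name, value in results.items():
    print(f"{name:>13}: mean opinion strength = {value:.2f}")
```

Under these assumptions the random sample lands close to the population figure, while the call-in group shows a noticeably higher average strength of opinion.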

To take an authentic random sample, you need a randomizing mechanism to
select the individuals. For example, the Gallup Organization starts with a
computerized list of all telephone exchanges in America, along with estimates
of the number of residential households that have those exchanges. The
computer uses a procedure called random digit dialing (RDD) to randomly
create phone numbers from those exchanges, and then selects samples of
telephone numbers from those. So what really happens is that the computer
creates a list of all possible household phone numbers in America and then
selects a subset of numbers from that list for Gallup to call.
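
The general idea of random digit dialing can be sketched as follows. This is not Gallup's actual procedure: the exchanges and household counts are invented, the function name `random_digit_dial` is hypothetical, and weighting each exchange by its household estimate is an assumption about how such estimates might be used.

```python
import random

# Hypothetical exchanges (area code, prefix) with made-up household estimates.
EXCHANGES = {
    ("212", "555"): 4200,
    ("312", "555"): 2500,
    ("415", "555"): 1300,
}

def random_digit_dial(exchanges, n, seed=None):
    """Build n distinct phone numbers by adding random digits to exchanges."""
    rng = random.Random(seed)
    keys = list(exchanges)
    weights = [exchanges[k] for k in keys]  # assumed: weight by households
    numbers = set()
    while len(numbers) < n:
        area, prefix = rng.choices(keys, weights=weights, k=1)[0]
        suffix = f"{rng.randrange(10_000):04d}"  # last four digits, 0000-9999
        numbers.add(f"{area}-{prefix}-{suffix}")
    return sorted(numbers)

for number in random_digit_dial(EXCHANGES, 5, seed=3):
    print(number)
```

Generating the last four digits at random means unlisted numbers have the same chance of coming up as listed ones, which a sample drawn from a phone book could not offer.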

Another example of random sampling involves the use of random number
generators. In this process, the items in the sample are chosen using a
computer-generated list of random numbers, so that each possible sample of
items has the same chance of being selected. Researchers may use this type of
randomization to assign patients to a treatment group versus a control group
in an experiment. This process is equivalent to drawing names out of a hat or
drawing numbers in a lottery.
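
Random assignment to groups works like shuffling the names in the hat and dealing them into two piles. In this minimal sketch the patient labels and the function name `assign_groups` are hypothetical, and an even split between the two groups is assumed.

```python
import random

def assign_groups(patients, seed=None):
    """Shuffle the patients, then split them into two groups."""
    rng = random.Random(seed)
    shuffled = list(patients)  # copy so the original list is left alone
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical study with 20 patients.
patients = [f"patient_{i:02d}" for i in range(1, 21)]
groups = assign_groups(patients, seed=42)

print("Treatment:", groups["treatment"])
print("Control:  ", groups["control"])
```

Every patient ends up in exactly one group, and which group that is depends only on the shuffle, not on anything about the patient.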




