Page 65 - Statistics for Dummies
Chapter 4: Tools of the Trade
Some of the biggest culprits of statistical misrepresentation caused by bad
sampling are surveys done on the Internet. You can find thousands of surveys
on the Internet that are done by having people log on to a particular Web site
and give their opinions. But even if 50,000 people in the U.S. complete a survey
on the Internet, it doesn’t represent the population of all Americans. It
represents only those folks who have Internet access, who logged on to that
particular Web site, and who were interested enough to participate in the survey
(which typically means that they have strong opinions about the topic in
question). The result of all these problems is bias: systematic favoritism of
certain individuals or certain outcomes of the study.
How do you select a sample in a way that avoids bias? The key word is
random. A random sample is a sample selected by equal opportunity; that
is, every possible sample the same size as yours had an equal chance to be
selected from the population. What random really means is that no group in
the population is favored in or excluded from the selection process.
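To make "every possible sample had an equal chance" concrete, here is a minimal sketch that lists every sample of size 2 from a tiny population of four people. The names and the helper enumerate_samples are made up for illustration; they don't come from any real study.

```python
from itertools import combinations
from math import comb

def enumerate_samples(population, n):
    """Return every possible sample of size n and the chance of each one.

    Under simple random sampling, each of the C(N, n) possible samples
    is equally likely, so each has chance 1 / C(N, n).
    """
    samples = list(combinations(population, n))
    chance_each = 1 / comb(len(population), n)
    return samples, chance_each

if __name__ == "__main__":
    people = ["Ana", "Ben", "Cy", "Di"]  # hypothetical population
    samples, chance_each = enumerate_samples(people, 2)
    print(len(samples))  # 6 possible samples of size 2
    for s in samples:
        print(s, round(chance_each, 3))
```

Each person shows up in exactly 3 of the 6 possible samples, so nobody is favored and nobody is left out.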
Non-random (in other words, bad) samples are samples that were selected in
such a way that some type of favoritism and/or automatic exclusion of a part
of the population was involved. A classic example of a non-random sample
comes from polls for which the media asks you to phone in your opinion on
a certain issue (“call-in” polls). People who choose to participate in call-in
polls do not represent the population at large because they had to be
watching that program, and they had to feel strongly enough to call in. They
technically don’t represent a sample at all, in the statistical sense of the word,
because no one selected them beforehand; they selected themselves to
participate, creating a volunteer or self-selected sample. The results will be
skewed toward people with strong opinions.
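A small simulation can show how self-selection skews results. The sketch below is only an illustration under made-up assumptions: each person's opinion strength is a number between 0 and 1, and the chance that a person calls in equals that strength. Neither assumption comes from real polling data.

```python
import random

def strong_share(strengths, cutoff=0.5):
    """Fraction of people whose opinion strength exceeds the cutoff."""
    return sum(s > cutoff for s in strengths) / len(strengths)

def simulate(population_size=100_000, sample_size=1_000, seed=1):
    rng = random.Random(seed)

    # Hypothetical population: opinion strength spread evenly from 0 to 1.
    population = [rng.random() for _ in range(population_size)]

    # Random sample: a randomizing mechanism picks the respondents.
    random_sample = rng.sample(population, sample_size)

    # Call-in poll: people pick themselves. ASSUMPTION: the chance that a
    # person calls in equals his or her opinion strength.
    call_in = [s for s in population if rng.random() < s]

    return (strong_share(population),
            strong_share(random_sample),
            strong_share(call_in))

if __name__ == "__main__":
    pop, rnd, call = simulate()
    print(f"population:    {pop:.2f}")
    print(f"random sample: {rnd:.2f}")
    print(f"call-in poll:  {call:.2f}")
```

Under these assumptions, the random sample lands close to the population's share of strong opinions, while the call-in group overstates it by a wide margin.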
To take an authentic random sample, you need a randomizing mechanism
to select the individuals. For example, the Gallup Organization starts with a
computerized list of all telephone exchanges in America, along with estimates
of the number of residential households that have those exchanges. The
computer uses a procedure called random digit dialing (RDD) to randomly create
phone numbers from those exchanges, and then selects samples of telephone
numbers from those. So what really happens is that the computer creates a
list of all possible household phone numbers in America and then selects a
subset of numbers from that list for Gallup to call.
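The general idea of random digit dialing can be sketched in a few lines. This is not Gallup's actual procedure; the exchanges, the household estimates, and the function random_digit_dial are all hypothetical. The sketch picks an exchange with a chance proportional to its estimated number of households, and then attaches four random digits.

```python
import random

# Hypothetical exchanges (area code plus prefix) with made-up estimates
# of how many residential households use each one.
EXCHANGES = {"614-555": 4200, "212-555": 9100, "406-555": 800}

def random_digit_dial(n, exchanges=EXCHANGES, seed=None):
    """Generate n distinct phone numbers by random digit dialing (sketch)."""
    if n > 10_000 * len(exchanges):
        raise ValueError("n exceeds the number of possible phone numbers")
    rng = random.Random(seed)
    prefixes = list(exchanges)
    weights = [exchanges[p] for p in prefixes]
    numbers = set()
    while len(numbers) < n:
        # Exchanges with more households are picked more often.
        prefix = rng.choices(prefixes, weights=weights, k=1)[0]
        # The last four digits are chosen completely at random.
        suffix = rng.randrange(10_000)
        numbers.add(f"{prefix}-{suffix:04d}")
    return sorted(numbers)

if __name__ == "__main__":
    for number in random_digit_dial(5, seed=3):
        print(number)
```

Because the last four digits are generated rather than looked up, unlisted numbers have the same chance of selection as listed ones within an exchange.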
Another example of random sampling involves the use of random number
generators. In this process, the items in the sample are chosen using a
computer-generated list of random numbers, where each sample of items
has the same chance of being selected. Researchers may use this type of
randomization to assign patients to a treatment group versus a control group in
an experiment. This process is equivalent to drawing names out of a hat or
drawing numbers in a lottery.
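Here is one way the "names out of a hat" idea might look with a random number generator, using Python's standard random module. The patient IDs and the function names are invented for illustration; a real study would follow its own randomization protocol.

```python
import random

def simple_random_sample(items, k, seed=None):
    """Draw k items so that every sample of size k is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(items), k)

def assign_groups(patients, seed=None):
    """Randomly split patients into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(patients)
    rng.shuffle(shuffled)          # like mixing the names in the hat
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

if __name__ == "__main__":
    # Hypothetical patient IDs P001 through P020.
    patients = [f"P{i:03d}" for i in range(1, 21)]
    treatment, control = assign_groups(patients, seed=7)
    print("treatment:", treatment)
    print("control:  ", control)
    print("sample of 5:", simple_random_sample(patients, 5, seed=7))
```

Passing a seed makes the draw repeatable, which helps if someone later needs to check how the groups were formed; leaving it out gives a fresh draw each time.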