Page 38 - Intermediate Statistics for Dummies
Chapter 1: Beyond Number Crunching: The Art and Science of Data Analysis
Getting the bad news
As you can see in Figures 1-2a and 1-2b, Ellen’s data doesn’t follow the typical
bell-shaped curve. One problem is that her data takes on only positive
whole-number values, so numbers like 1.2 and 2.3 aren’t possible. (A normal
distribution is continuous, so it can take on any value within a range.) The
other problem is that every value falls within the typical two- to five-day
range, so the histogram doesn’t have a chance to take on a bell shape.
Perhaps more data would have curbed this problem. At any rate, Ellen knows
that the conditions for a two-sample t-test aren’t met here: the data doesn’t
have a normal distribution and is, in fact, skewed (meaning set off to one
side or the other).
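If you want to see this kind of lopsidedness in numbers rather than in a picture, here is a minimal sketch. The data values are made up for illustration (they are not Ellen’s measurements), and the helper name sample_skewness is my own. It prints a rough text histogram and a moment-based skewness value, where a negative result suggests a longer left tail and a positive result a longer right tail.

```python
from collections import Counter
from statistics import mean


def sample_skewness(values):
    """Moment-based skewness: third central moment over (second central moment)^1.5.

    Assumes the values are not all identical (otherwise the spread is zero).
    """
    n = len(values)
    center = mean(values)
    m2 = sum((x - center) ** 2 for x in values) / n
    m3 = sum((x - center) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Hypothetical data: days each rose lasted (NOT the book's actual data).
days = [2, 3, 3, 4, 4, 4, 4, 5, 5, 5]

# Rough text histogram: one star per rose at each whole-number value.
counts = Counter(days)
for value in sorted(counts):
    print(f"{value} days: {'*' * counts[value]}")

print("skewness:", round(sample_skewness(days), 2))
```

With only a handful of distinct whole-number values, the histogram has very few bars, which is part of why it can’t trace out a smooth bell shape.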
Going nonparametric
Undaunted by this turn of events, Ellen employs a nonparametric test of her
data, which is the right thing to do. Statisticians use nonparametric statistics
in situations where the assumptions of the typical analyses aren’t met (like
not having a normal distribution). However, nonparametric stats often give
more conservative (albeit more accurate) results than the typical (parametric)
procedures you’re used to using. (I discuss nonparametrics a bit more in
the last section of this chapter. Nonparametric procedures are discussed in
full detail in Chapters 16–19.)
Because Ellen’s data doesn’t have a normal distribution or even a symmetric
distribution (meaning one that looks the same on each side when you split it
down the middle), the mean (or average) isn’t a good measure of the center
of the data, so a two-sample t-test isn’t appropriate. As an alternative, she
can compare the histograms of the two populations in question (all roses
given water versus all roses given sugar water) and test whether they are
the same.
Because she’s comparing two groups, Ellen uses a Wilcoxon Rank Sum test,
also known as the Mann-Whitney test (see Chapter 19). The Wilcoxon Rank
Sum test checks whether two populations have the same distribution (meaning
whether the two histograms look the same) versus one of the populations
being shifted to the right or left. Ellen’s theory is that the sugar group lasts longer,
so she tests Ho: Sugar group and control group have the same distribution
versus Ha: Sugar group is shifted to the right of the control group.
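To make the mechanics of a rank sum test concrete, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the data values are made up (they are not Ellen’s), the function names are my own, and the p-value uses a normal approximation with a tie correction and no continuity correction. Statistical software may compute an exact p-value or apply a continuity correction instead, so treat the result as approximate.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test_greater(treatment, control):
    """One-sided test of Ha: treatment group is shifted to the right of control.

    Returns (W, z, p): the treatment rank sum, its z-score, and an
    approximate upper-tail p-value. Assumes not every value is identical.
    """
    n1, n2 = len(treatment), len(control)
    n = n1 + n2
    combined = list(treatment) + list(control)
    ranks = average_ranks(combined)
    w = sum(ranks[:n1])                      # rank sum of the treatment group
    mean_w = n1 * (n + 1) / 2                # expected rank sum under Ho
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    p = 0.5 * math.erfc(z / math.sqrt(2))    # P(Z >= z) for a standard normal
    return w, z, p


# Hypothetical data: days each rose lasted (NOT the book's actual data).
sugar = [3, 4, 4, 5, 4, 5, 3, 4, 4, 5]
control = [2, 3, 4, 4, 4, 5, 3, 4, 4, 4]

w, z, p = rank_sum_test_greater(sugar, control)
print(f"W = {w}, z = {z:.2f}, approximate one-sided p-value = {p:.3f}")
```

The idea is that if the sugar group really is shifted to the right, its roses grab more of the high ranks, so its rank sum W lands well above what you’d expect under Ho. A small p-value would lead you to reject Ho; a large one would not.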
Ellen strikes out
To cut to the chase, the Wilcoxon Rank Sum test unfortunately fails to reject
Ellen’s null hypothesis. Her experiment didn’t give her enough evidence for
what she wanted to confirm: not enough roses in the sugar group lasted longer
than the roses in the control group. You can see the underlying reason for this result
by comparing the medians of the two groups. When you find the median of
each of the data sets in Table 1-1, you get the value of 4 in each case. Because
the medians of the two data sets are equal, it’s unlikely that Ellen can find a
statistically significant result by using this test.
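Comparing medians takes only a few lines of code. The sketch below uses made-up values (not the contents of Table 1-1), chosen so that both medians come out equal, just to show how two groups can share the same center even when their means differ a little.

```python
from statistics import mean, median

# Hypothetical data: days each rose lasted (NOT the values from Table 1-1).
sugar = [3, 4, 4, 5, 4, 5, 3, 4, 4, 5]
control = [2, 3, 4, 4, 4, 5, 3, 4, 4, 4]

for name, group in (("sugar", sugar), ("control", control)):
    # The median is the middle value once the data are sorted.
    print(f"{name}: median = {median(group)}, mean = {mean(group)}")
```

When the two medians match like this, a test that looks for one group being shifted to the right of the other has little to work with.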