Page 187 - Intermediate Statistics for Dummies
Part III: Comparing Many Means with ANOVA
Checking off independence
To check the first condition, examine how the data was collected from each
of the separate populations. To maintain independence, the outcomes from
one population can’t affect the outcomes of the other populations. If the
data was collected by taking a separate random sample from each population
(random here meaning that each individual in the population had an equal
chance of being selected), independence is ensured at the strongest level.
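If you like to see ideas in code, here is a minimal Python sketch of taking a separate random sample from each population. The population names, the ID numbers, and the function sample_each_population are all made up for illustration; they don't come from the seed spitting contest.

```python
import random

def sample_each_population(populations, n, seed=None):
    """Draw a separate simple random sample of size n from each population."""
    rng = random.Random(seed)
    # rng.sample draws without replacement, so every individual in a
    # population has the same chance of being selected.
    return {name: rng.sample(members, n) for name, members in populations.items()}

# Hypothetical populations: ID numbers standing in for individuals.
populations = {
    "group_a": list(range(1000, 1200)),
    "group_b": list(range(2000, 2200)),
    "group_c": list(range(3000, 3200)),
}

samples = sample_each_population(populations, n=5, seed=1)
for name, ids in samples.items():
    print(name, sorted(ids))
```

Because each group's sample is drawn on its own, who gets picked in one group has no bearing on who gets picked in another.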
In the watermelon seed spitting data (see Table 9-1), the data isn’t randomly
sampled from each age group because it represents everyone who participated
in the contest. But you can argue that the seed spitting distances from one
age group don’t affect the seed spitting distances from the other age groups,
so the independence assumption is reasonable here as well.
Looking for what’s normal
The second ANOVA condition is that each of the k populations has a normal
distribution. To check this condition, make a separate histogram of the data
from each group and see whether it resembles a normal distribution. Data
from a normal distribution should look symmetric (in other words, if you
split the histogram down the middle, it looks the same on each side) and
have a bell shape. Don’t expect the data in each histogram to follow a
normal distribution exactly (remember, it’s only a sample), but it shouldn’t
be extremely different from a normal, bell-shaped distribution.
Because the data contains only five children per age group, checking conditions
can be iffy. But in this case, you have past data for 200 children in each age
group, so you can use that to check the conditions. The histograms and
descriptive statistics of the seed spitting data for the four age groups are
shown in Figure 9-2, all in one panel, so you can easily compare them to each
other on the same scale. Looking at the four histograms in Figure 9-2, you can
see that each graph resembles a bell shape, so the normality condition doesn’t
appear to be violated here.
(Red flags should come up if you see two peaks in the data, or a skewed shape
where the peak is off to one side, or if the histogram is flat, for example.)
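A number can back up what your eyes tell you about a skewed shape. The Python sketch below computes sample skewness (the average cubed z-score), which sits near 0 for symmetric data and moves away from 0 when the peak is off to one side. The data is simulated purely for illustration and isn't the seed spitting data; the function name skewness is also just an assumption for this example. Treat the result as a companion to the histogram, not a replacement for looking at it.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the average cubed z-score of the values."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population-form standard deviation
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

rng = random.Random(0)
# Simulated data only: one roughly bell-shaped group, one right-skewed group.
bell_shaped = [rng.gauss(40, 5) for _ in range(200)]
right_skewed = [rng.expovariate(1 / 10) for _ in range(200)]

print("bell-shaped group skewness: ", round(skewness(bell_shaped), 2))
print("right-skewed group skewness:", round(skewness(right_skewed), 2))
```

Skewness won't catch every red flag. Two peaks or a flat histogram can still produce a value near 0, so keep the graph in front of you.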
You can use Minitab to make histograms for each of your samples and have
all of them appear on one large panel, all using the same scale. To do this, go
to Graph>Histogram and click OK. Choose the variables that represent data
from each sample by highlighting them in the left-hand box and clicking
Select. Then click on Multiple Graphs, and a new window opens. Under the
Show Graph Variables option, check the following box: In separate panels of
the same graph. On the Same Scales for Graphs option, check the box for x
and the box for y. This option gives you the same scale on both the x and y
axes for all the histograms. Then click OK.
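If you don't have Minitab handy, the same idea (one panel per group, all on the same x and y scales) can be sketched in plain Python with text histograms. This isn't what Minitab does internally; it's a rough stand-in. The group names and the simulated distances are invented for illustration, and shared_scale_histograms and print_panels are hypothetical helper names.

```python
import random

def shared_scale_histograms(groups, bins=8):
    """Count every group's values in the same bins (a shared x scale)."""
    all_values = [v for values in groups.values() for v in values]
    lo, hi = min(all_values), max(all_values)
    if hi == lo:
        raise ValueError("all values are identical; nothing to bin")
    width = (hi - lo) / bins
    counts = {}
    for name, values in groups.items():
        group_counts = [0] * bins
        for v in values:
            # The overall maximum would land past the last bin, so clamp it.
            index = min(int((v - lo) / width), bins - 1)
            group_counts[index] += 1
        counts[name] = group_counts
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts

def print_panels(edges, counts, bar_width=40):
    """Print one text panel per group, scaling bars to a shared y maximum."""
    top = max(max(c) for c in counts.values())
    for name, group_counts in counts.items():
        print(name)
        for i, n in enumerate(group_counts):
            bar = "#" * round(n / top * bar_width)
            print(f"  {edges[i]:6.1f} to {edges[i + 1]:6.1f} | {bar} {n}")

rng = random.Random(42)
# Simulated distances for four hypothetical groups of 200 each.
groups = {
    "group_1": [rng.gauss(35, 5) for _ in range(200)],
    "group_2": [rng.gauss(40, 5) for _ in range(200)],
    "group_3": [rng.gauss(45, 5) for _ in range(200)],
    "group_4": [rng.gauss(50, 5) for _ in range(200)],
}

edges, counts = shared_scale_histograms(groups)
print_panels(edges, counts)
```

Using one set of bin edges for every group plays the role of the shared x axis, and scaling every bar to the largest count across all groups plays the role of the shared y axis.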