Page 180 - Statistics II for Dummies
Part III: Analyzing Variance with ANOVA
Breaking down the variance into sums of squares
Step one of the F-test is splitting up the variability in the y variable into
portions that define where the variability is coming from. Each portion of
variability is called a sum of squares. The term analysis of variance is a great
description for exactly how you conduct a test of k population means. With
the overall goal of testing whether k population (or treatment) means are
equal, you take a random sample from each of the k populations.
You first put all the data together into one big group and measure how much
total variability there is; this variability is called the sums of squares total, or
SSTO. If the data are really diverse, SSTO is large. If the data are very similar,
SSTO is small.
You can split the total variability in the combined data set (SSTO) into two
parts:
✓ SST: The variability between the groups, known as the sums of squares
for treatment
✓ SSE: The variability within the groups, known as the sums of squares
for error
Splitting up the variability in your data results in one of the most important
equalities in ANOVA:
SSTO = SST + SSE
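To see the equality in action, here is a minimal Python sketch that computes all three sums of squares for a small data set. The function name sums_of_squares and the three samples are made up for illustration; they don't come from a real study.

```python
def sums_of_squares(groups):
    """Return (SSTO, SST, SSE) for a list of samples, one list per group."""
    all_values = [x for g in groups for x in g]
    overall_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Total variability: every value versus the overall mean.
    ssto = sum((x - overall_mean) ** 2 for x in all_values)
    # Between-group variability: each sample mean versus the overall mean,
    # counted once for every value in that sample.
    sst = sum(len(g) * (m - overall_mean) ** 2
              for g, m in zip(groups, group_means))
    # Within-group variability: every value versus its own sample mean.
    sse = sum((x - m) ** 2
              for g, m in zip(groups, group_means)
              for x in g)
    return ssto, sst, sse


# Hypothetical data: k = 3 samples, invented for this example only.
samples = [[4, 5, 6], [7, 8, 9], [1, 2, 3]]
ssto, sst, sse = sums_of_squares(samples)
print(ssto, sst, sse)  # 60.0 54.0 6.0
```

Here most of the total variability (54 out of 60) sits between the groups, and only a little (6) sits within them.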
The formula for SSTO is the numerator of the formula for $s^2$, the variance of a
single data set, so
$$\text{SSTO} = \sum_{i} \sum_{j} (x_{ij} - \bar{x})^2,$$
where $x_{ij}$ is the $j$th value in the sample from the $i$th population and
$\bar{x}$ is the overall sample mean (the mean of the entire data set). So, in
terms of ANOVA, SSTO is the total squared distance between the data values and
their overall mean.
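Because SSTO is the numerator of the sample variance, you can check it against the variance of the combined data: SSTO should equal (n - 1) times the sample variance, where n is the total number of values. The sketch below assumes a made-up combined data set and uses statistics.variance from Python's standard library, which divides by n - 1.

```python
import statistics

# Hypothetical combined data set (all samples pooled into one list).
data = [4, 5, 6, 7, 8, 9, 1, 2, 3]

n = len(data)
overall_mean = sum(data) / n

# SSTO: total squared distance between the values and the overall mean.
ssto = sum((x - overall_mean) ** 2 for x in data)

# statistics.variance returns the sample variance s^2 (divisor n - 1),
# so multiplying by n - 1 recovers the numerator.
s_squared = statistics.variance(data)
print(ssto, (n - 1) * s_squared)
```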
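The formula for SST is
$$\text{SST} = \sum_{i} n_i (\bar{x}_i - \bar{x})^2,$$
where $n_i$ is the size of the sample coming from the $i$th population,
$\bar{x}_i$ is the mean of that sample, and $\bar{x}$ is the overall sample mean.
SST represents the total squared distance between the means from each sample and
the overall mean, with each sample mean weighted by its sample size.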
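One consequence of the SST formula is that you don't need the raw data to compute it; the sample sizes and sample means are enough. Here's a small sketch under that assumption, with invented summary statistics for three samples of unequal size.

```python
# Hypothetical summary statistics for k = 3 samples (not real data).
sizes = [2, 4, 2]          # n_i: size of each sample
means = [3.0, 6.0, 5.0]    # sample mean of each sample

n_total = sum(sizes)

# The overall mean is the size-weighted average of the sample means.
overall_mean = sum(n * m for n, m in zip(sizes, means)) / n_total

# SST: squared distance of each sample mean from the overall mean,
# weighted by that sample's size.
sst = sum(n * (m - overall_mean) ** 2 for n, m in zip(sizes, means))
print(overall_mean, sst)  # 5.0 12.0
```

Notice that the larger sample (size 4) pulls the overall mean toward its own mean and contributes to SST in proportion to its size.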
The formula for SSE is
$$\text{SSE} = \sum_{i} \sum_{j} (x_{ij} - \bar{x}_i)^2,$$
where $x_{ij}$ is the $j$th value in the sample from the $i$th population and
$\bar{x}_i$ is the mean of the sample coming from the $i$th population. This
formula represents the total squared distance between the values in each sample
and their corresponding sample means.
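Since SSE adds up squared distances within each sample separately, it can also be built from the individual sample variances: each sample contributes (its size minus 1) times its own sample variance. The sketch below compares the two routes on made-up samples; statistics.mean and statistics.variance come from Python's standard library.

```python
import statistics

# Hypothetical samples of unequal size, invented for illustration.
samples = [[4, 5, 6], [7, 8, 9, 10], [1, 3]]

# Route 1: sum squared distances from each sample's own mean.
sse_direct = 0.0
for group in samples:
    group_mean = statistics.mean(group)
    sse_direct += sum((x - group_mean) ** 2 for x in group)

# Route 2: (n_i - 1) times each sample variance, summed over the samples.
sse_from_variances = sum(
    (len(group) - 1) * statistics.variance(group) for group in samples
)

print(sse_direct, sse_from_variances)
```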
Using algebra, you can confirm (with some serious elbow grease) that SSTO =
SST + SSE.
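For the curious, here is a sketch of that algebra. It uses the usual notation, where x_ij is the jth value in the ith sample, x-bar_i is the mean of the ith sample, x-bar is the overall mean, and n_i is the size of the ith sample. The trick is to add and subtract the sample mean inside the square; the cross term then vanishes because the deviations of a sample's values from their own mean always sum to zero.

```latex
\begin{align*}
\text{SSTO}
  &= \sum_i \sum_j (x_{ij} - \bar{x})^2
   = \sum_i \sum_j \bigl[(x_{ij} - \bar{x}_i) + (\bar{x}_i - \bar{x})\bigr]^2 \\
  &= \sum_i \sum_j (x_{ij} - \bar{x}_i)^2
   + 2 \sum_i (\bar{x}_i - \bar{x}) \sum_j (x_{ij} - \bar{x}_i)
   + \sum_i \sum_j (\bar{x}_i - \bar{x})^2 \\
  &= \text{SSE} + 0 + \sum_i n_i (\bar{x}_i - \bar{x})^2 \\
  &= \text{SSE} + \text{SST}
\end{align*}
```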