Page 223 - Intermediate Statistics for Dummies
Part III: Comparing Many Means with ANOVA
Now that you have calculated SSTO and SSE, you need the bridge between them. That is, you need a formula that connects the variability in the y_i's (SSTO) and the variability in the residuals after fitting the regression line (SSE). That bridge is SSR (equivalent to SST in ANOVA). In regression, ŷ_i represents the predicted value of y_i based on the regression model. These are the values on the regression line. To assess how much this regression line helps to predict the y-values, you compare it to the model you would get without any x variable in it.
Without any other information, the only thing you can do to predict y is look at the average, ȳ. So, SST compares the predicted value from the regression line to the predicted value from the flat line (the mean of the y's) by subtracting them. The result is ŷ_i − ȳ. Square each result and sum them all up, and you get the formula for SST: Σ(ŷ_i − ȳ)².
Now for one last hoop to jump through (as if you haven't had enough already). Instead of calling the sum of squares for the regression model SST, as is done in ANOVA, statisticians call it SSR, for sum of squares regression. Consider SSR from regression to be equivalent to SST from ANOVA. This distinction is important because computer output lists the sum of squares for the regression model as SSR, not SST.
To summarize the sums of squares as they apply to regression, you have SSTO = SSR + SSE, where

- SSTO measures the variability in the observed y-values around their mean, ȳ. Dividing this value by n − 1 gives the variance of the y-values.

- SSE measures the variability between the predicted values for y (the values on the line) and the observed y-values. SSE represents the variability left over after the line has been fit to the data.

- SSR measures the variability in the predicted values for y (the values on the line) from the mean of y. SSR is the sum of squares due to the regression model (the line) itself.
Minitab calculates all the sums of squares for you as part of the regression
analysis. You can see this calculation in the section “Bringing regression to
the ANOVA table.”
Dividing up the degrees of freedom
In ANOVA, you test a model for the treatment (population) means by using an F-test, which is F = MST / MSE. To get MST (the mean sum of squares for treatment), you take SST (the sum of squares for treatment) and divide by its degrees of

