Page 231 - Statistics II for Dummies

Chapter 12: Regression and ANOVA: Surprise Relatives! 215

Degrees of freedom in regression
The degrees of freedom for SST in ANOVA equal the number of treatments
minus one. How does the degrees of freedom idea relate to regression? The
number of treatments in regression is equivalent to the number of parameters
in a model (a parameter being an unknown constant in the model that
you’re trying to estimate).
When you test a model, you’re always comparing it to a different (simpler)
model to see whether it fits the data better. In linear regression, you compare
your regression line, y = a + bx, to the horizontal line y = ȳ. This second,
simpler model just uses the mean of y to predict y all the time, no matter
what x is. In the regression line, you have two coefficients: one to estimate
the parameter for the y-intercept (a) and one to estimate the parameter for
slope (b) in the model. In the second, simpler model, you have only one
parameter: the value of the mean. The degrees of freedom for SSR in simple
linear regression is the difference in the number of parameters from the two
models: 2 – 1 = 1.
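
If you want to see the two competing models side by side, here's a minimal Python sketch. The five data points are made up purely for illustration (they aren't from the book), and the variable names are placeholders. It fits the least-squares line, uses the mean of y as the simpler model, and counts the parameters in each model to get the degrees of freedom for SSR.

```python
# Hypothetical toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates of the slope (b) and y-intercept (a).
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b = sxy / sxx
a = y_bar - b * x_bar

# Squared error of the simpler model, which always predicts y_bar.
sse_mean_model = sum((yi - y_bar) ** 2 for yi in y)

# Squared error of the regression line y = a + bx.
sse_line = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Count the parameters in each model and take the difference.
params_line = 2   # y-intercept and slope
params_mean = 1   # the mean of y
df_ssr = params_line - params_mean

print("slope b:", round(b, 4), "intercept a:", round(a, 4))
print("error of mean model:", round(sse_mean_model, 4))
print("error of regression line:", round(sse_line, 4))
print("degrees of freedom for SSR:", df_ssr)
```

The regression line can never do worse than the horizontal line on squared error, because the horizontal line is just the special case where the slope is zero.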
The degrees of freedom for SSE in ANOVA are n – k. In the formula for SSE,
SSE = Σ(yᵢ – ŷᵢ)², you see there are n predicted y-values, and k is the number of
treatments in the model. In regression, the number of parameters in the
model is k = 2 (the slope and the y-intercept). So you have n – 2 degrees of
freedom associated with SSE when you’re doing regression.

Putting all this together, the degrees of freedom for regression must add up
for the equation SSTO = SSR + SSE. The degrees of freedom corresponding to
this equation are (n – 1) = (2 – 1) + (n – 2), which is true if you do the math. So
the degrees of freedom for regression, using the ANOVA approach, all check
out. Whew!
In Figure 12-1, you can see the degrees of freedom for each sum of squares
listed under the DF column of the ANOVA part of the output. You see that SSR
has 2 – 1 = 1 degree of freedom and SSE has 250 – 2 = 248 degrees of freedom
(because n = 250 observations were in the data set, k = 2, and you find
n – k to get degrees of freedom for SSE). The degrees of freedom for SSTO are
250 – 1 = 249.
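
The degrees-of-freedom bookkeeping above can be sketched as a small Python helper. The function name regression_df is hypothetical, and it assumes simple linear regression (k = 2 parameters) unless you tell it otherwise.

```python
def regression_df(n, k=2):
    """Return the degrees of freedom for SSR, SSE, and SSTO.

    n is the number of observations; k is the number of parameters
    in the model (k = 2 for a line: slope and y-intercept).
    """
    df_ssr = k - 1    # parameters in the model minus the one in the mean-only model
    df_sse = n - k    # observations minus estimated parameters
    df_ssto = n - 1   # observations minus one
    # The pieces must add up, just like SSTO = SSR + SSE.
    assert df_ssto == df_ssr + df_sse
    return {"SSR": df_ssr, "SSE": df_sse, "SSTO": df_ssto}


# The n = 250 case described for Figure 12-1.
print(regression_df(250))  # {'SSR': 1, 'SSE': 248, 'SSTO': 249}
```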

Bringing regression to the ANOVA table

In ANOVA, you test your model Ho: All k population means are equal versus
Ha: At least two population means are different by using an F-test. You build
your F-test statistic by relating the sums of squares for treatment to the sum
of squares for error. To do this, you divide SSE and SST by their degrees of
