

                                Degrees of freedom in regression
                                The degrees of freedom for SST in ANOVA equal the number of treatments
                                minus one. How does the degrees of freedom idea relate to regression? The
                                number of treatments in regression is equivalent to the number of param-
                                eters in a model (a parameter being an unknown constant in the model that
                                you’re trying to estimate).
                                When you test a model, you’re always comparing it to a different (simpler)
model to see whether it fits the data better. In linear regression, you compare your regression line y = a + bx to the horizontal line y = ȳ. This second,
                                simpler model just uses the mean of y to predict y all the time, no matter
                                what x is. In the regression line, you have two coefficients: one to estimate
                                the parameter for the y-intercept (a) and one to estimate the parameter for
                                slope (b) in the model. In the second, simpler model, you have only one
                                parameter: the value of the mean. The degrees of freedom for SSR in simple
                                linear regression is the difference in the number of parameters from the two
                                models: 2 – 1 = 1.
The degrees of freedom for SSE in ANOVA is n – k. In the formula for SSE, Σ(y – ŷ)², you see there are n predicted y-values (the ŷ's), and k is the number of treatments in the model. In regression, the number of parameters in the model is k = 2 (the slope and the y-intercept). So you have n – 2 degrees of freedom associated with SSE when you're doing regression.
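If you want to check these pieces for yourself, here's a minimal sketch in Python (the numbers are made up for illustration, not from the book's data set) that computes SSR, SSE, and SSTO for a small simple-regression example and confirms that their degrees of freedom behave as described.

```python
# A minimal sketch, using made-up data, of the sums of squares and their
# degrees of freedom in simple linear regression (k = 2 parameters: a and b).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])
n = len(y)

b, a = np.polyfit(x, y, 1)           # fitted slope (b) and y-intercept (a)
y_hat = a + b * x                    # predictions from the regression line
y_bar = y.mean()                     # prediction from the simpler model y = y-bar

sse  = np.sum((y - y_hat) ** 2)      # error SS,      df = n - 2
ssr  = np.sum((y_hat - y_bar) ** 2)  # regression SS, df = 2 - 1 = 1
ssto = np.sum((y - y_bar) ** 2)      # total SS,      df = n - 1

print(ssr + sse, ssto)               # the two match (up to rounding): SSTO = SSR + SSE
print(1, n - 2, n - 1)               # degrees of freedom: 1 + (n - 2) = n - 1
```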

                                Putting all this together, the degrees of freedom for regression must add up
                                for the equation SSTO = SSR + SSE. The degrees of freedom corresponding to
                                this equation are (n – 1) = (2 – 1) + (n – 2), which is true if you do the math. So
                                the degrees of freedom for regression, using the ANOVA approach, all check
                                out. Whew!
                                In Figure 12-1, you can see the degrees of freedom for each sum of squares
listed under the DF column of the ANOVA part of the output. You see SSR has 2 – 1 = 1 degree of freedom, and SSE has 250 – 2 = 248 degrees of freedom (because the data set has n = 250 observations and k = 2, so n – k gives the degrees of freedom for SSE). The degrees of freedom for SSTO is 250 – 1 = 249.
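To see output laid out like the ANOVA portion of Figure 12-1, you can ask statistical software for the regression ANOVA table. Here's a minimal sketch using Python's statsmodels with n = 250 simulated points (simulated data, not the book's data set); the printed table shows 1 degree of freedom for the regression term and 248 for error.

```python
# A minimal sketch: simulate n = 250 points and print the regression ANOVA table.
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)
n = 250
data = pd.DataFrame({"x": rng.normal(size=n)})
data["y"] = 3.0 + 2.0 * data["x"] + rng.normal(size=n)  # made-up true line plus noise

fit = ols("y ~ x", data=data).fit()
print(anova_lm(fit))  # rows: x (df = 1) and Residual (df = n - 2 = 248), with SS, F, p-value
```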


                                Bringing regression to the ANOVA table


In ANOVA, you test your model Ho: All k population means are equal versus Ha: At least two population means are different by using an F-test. You build your F-test statistic by relating the sum of squares for treatments to the sum of squares for error. To do this, you divide SSE and SST by their degrees of











