Page 224 - Intermediate Statistics for Dummies
                                                              Chapter 12: Rock My World: Relating Regression to ANOVA
                                                    freedom. You do the same with MSE (that is, take SSE, the sum of squares
                                                    for error, and divide by its degrees of freedom). The question now is, what do
                                                    those degrees of freedom represent and how do they relate to regression?
                                                    This section addresses that issue.
                                                    Degrees of freedom in ANOVA
                                                    In ANOVA, the degrees of freedom for SSTO is n – 1, which represents the
                                                    sample size minus one. In the formula for SSTO, $\sum (y_i - \bar{y})^2$, you see
                                                    there are n observed y-values minus one mean. That in a very general way is
                                                    where the n – 1 comes from.
                                                    Note that if you divide SSTO by n – 1, you get $\frac{\sum (y_i - \bar{y})^2}{n - 1}$, the
                                                    variance in the y-values. This calculation makes good sense because the
                                                    variance also measures the total variability in the y-values.
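
To make the link between SSTO and the variance concrete, here is a minimal Python sketch. The y-values are hypothetical, made up for illustration only; the check simply compares SSTO divided by n – 1 against the sample variance from Python's standard `statistics` module.

```python
import statistics

# Hypothetical y-values, made up purely for illustration.
y = [3.0, 5.0, 4.0, 8.0, 10.0]

n = len(y)
y_bar = sum(y) / n

# SSTO: sum of squared deviations of each y-value from the mean.
ssto = sum((yi - y_bar) ** 2 for yi in y)

# Dividing SSTO by its degrees of freedom (n - 1) gives the sample variance.
variance_from_ssto = ssto / (n - 1)

print("SSTO:", ssto)
print("SSTO / (n - 1):", variance_from_ssto)
print("statistics.variance(y):", statistics.variance(y))
```

The last two printed values should agree, because `statistics.variance` also uses n – 1 in the denominator.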
                                                    The degrees of freedom for SSE is n – k. In the formula for SSE, $\sum (y_i - \hat{y}_i)^2$,
                                                    you see there are n observed y-values, and k is the number of treatments in
                                                    the model. In regression, the number of coefficients in the model is k = 2 (the
                                                    slope and the y-intercept). So you have degrees of freedom n – 2 associated
                                                    with SSE when you’re doing regression.
                                                    Degrees of freedom in regression
                                                    The degrees of freedom for SST in ANOVA equals the number of treatments
                                                    minus one. How does the degrees of freedom idea relate to regression? The
                                                    number of treatments in regression is equivalent to the number of parame-
                                                    ters in a model (a parameter being an unknown constant in the model that
                                                    you’re trying to estimate).
                                                    When you test a model, you’re always comparing it to a different (simpler)
                                                    model to see whether it fits the data better. In linear regression you compare
                                                    your regression line, $y = b_0 + b_1 x$, to the horizontal line $y = \bar{y}$. This second,
                                                    simpler model just uses the mean of y to predict y all the time, no matter what x
                                                    is. In the regression line, you have two coefficients: one to estimate the
                                                    parameter for the y-intercept ($b_0$) and one to estimate the parameter for slope
                                                    ($b_1$) in the model. In the second, simpler model, you have only one parameter:
                                                    the value of the mean. The degrees of freedom for SSR in simple linear
                                                    regression is the difference in the parameters of the two models: 2 – 1 = 1.
                                                    Putting all this together, the degrees of freedom for regression must add up
                                                    for the equation SSTO = SSR + SSE. The degrees of freedom corresponding to
                                                    this equation are (n – 1) = (2 – 1) + (n – 2), which is true if you do the math. So
                                                    the degrees of freedom for regression, using the ANOVA approach, all check
                                                    out. Whew!
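
If you want to see the bookkeeping check out numerically, the following Python sketch fits a least-squares line to a small data set and verifies both SSTO = SSR + SSE and (n – 1) = (2 – 1) + (n – 2). The data and all variable names are hypothetical, chosen only for illustration; the slope and intercept use the standard least-squares formulas rather than any particular statistics package.

```python
# Hypothetical (x, y) data, made up purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 4.0, 8.0, 10.0]

n = len(y)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope (b1) and y-intercept (b0).
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Predicted values from the regression line.
y_hat = [b0 + b1 * xi for xi in x]

# The three sums of squares.
ssto = sum((yi - y_bar) ** 2 for yi in y)              # total
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # regression
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error

# Degrees of freedom for each sum of squares.
df_total = n - 1        # n y-values minus one mean
df_regression = 2 - 1   # two coefficients minus the one-parameter mean model
df_error = n - 2        # n y-values minus two estimated coefficients

# Mean squares: each sum of squares divided by its degrees of freedom.
msr = ssr / df_regression
mse = sse / df_error

print("SSTO:", ssto, " SSR + SSE:", ssr + sse)
print("df:", df_total, "=", df_regression, "+", df_error)
print("MSR:", msr, " MSE:", mse)
```

Up to floating-point rounding, the two sums of squares on the first printed line match, and the degrees of freedom on the second line add up in the same way.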