Page 230 - Statistics II for Dummies

214        Part III: Analyzing Variance with ANOVA



                                Instead of calling the sum of squares for the regression model SST as is done
                                in ANOVA, statisticians call it SSR for sum of squares regression. Consider SSR
                                to be equivalent to the SST from ANOVA. You need to know the difference
                                because computer output lists the sums of squares for the regression model
                                as SSR, not SST.
                                To summarize the sums of squares as they apply to regression, you have
                                SSTO = SSR + SSE where

                                  ✓ SSTO measures the variability in the observed y-values around their
                                    mean. Dividing SSTO by n – 1 gives the variance of the y-values.
                                  ✓ SSE represents the variability between the predicted values for y (the
                                    values on the line) and the observed y-values. SSE represents the
                                    variability left over after the line has been fit to the data.
                                  ✓ SSR measures the variability in the predicted values for y (the values on
                                    the line) from the mean of y. SSR is the sum of squares due to the
                                    regression model (the line) itself.
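
                                The three sums of squares above can be checked by hand on a small data
                                set. The sketch below is only an illustration: the x- and y-values are
                                made up, and the variable names (ssto, ssr, sse, y_hat) are chosen here
                                for readability rather than taken from Minitab output.

```python
from statistics import mean

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)

# Least-squares slope (b1) and intercept (b0) for the line y_hat = b0 + b1 * x.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Predicted values (the values on the line).
y_hat = [b0 + b1 * xi for xi in x]

# SSTO: observed y-values around their mean.
ssto = sum((yi - y_bar) ** 2 for yi in y)
# SSE: observed y-values around the predicted values.
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
# SSR: predicted values around the mean of y.
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)

print(f"SSTO = {ssto:.3f}, SSR = {ssr:.3f}, SSE = {sse:.3f}")
print(f"SSR + SSE = {ssr + sse:.3f}")
```

                                For a least-squares line, SSR + SSE matches SSTO up to rounding, which
                                is the identity SSTO = SSR + SSE stated above.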

                                Minitab calculates all the sums of squares for you as part of the regression
                                analysis. You can see this calculation in the section “Bringing regression to
                                the ANOVA table.”


                                Dividing up the degrees of freedom


                                In ANOVA, you test a model for the treatment (population) means by using an
                                F-test, which is $F = \frac{MST}{MSE}$. To get MST (the mean sum of squares for treatment),
                                you take SST (the sum of squares for treatment) and divide by its degrees of
                                freedom. You do the same with MSE (that is, take SSE, the sum of squares for
                                error, and divide by its degrees of freedom). The questions now are, what do
                                those degrees of freedom represent, and how do they relate to regression?
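
                                As a rough sketch of how the F-statistic is assembled, the example
                                below divides each sum of squares by its degrees of freedom and then
                                takes the ratio of MST to MSE. The three groups are invented data, and
                                the degrees of freedom used (k – 1 for treatment and n – k for error,
                                the usual one-way ANOVA values) are an assumption at this point in the
                                discussion.

```python
from statistics import mean

# Hypothetical treatment groups, invented for illustration only.
groups = {
    "A": [4, 5, 6],
    "B": [6, 7, 8],
    "C": [8, 9, 10],
}

all_values = [v for g in groups.values() for v in g]
grand_mean = mean(all_values)
n = len(all_values)   # total sample size
k = len(groups)       # number of treatments

# SST: variability of the group means around the grand mean.
sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# SSE: variability of the observations around their own group mean.
sse = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

# Assumed degrees of freedom for one-way ANOVA.
df_treatment = k - 1
df_error = n - k

mst = sst / df_treatment   # mean sum of squares for treatment
mse = sse / df_error       # mean sum of squares for error
f_stat = mst / mse

print(f"MST = {mst:.3f}, MSE = {mse:.3f}, F = {f_stat:.3f}")
```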

                                Degrees of freedom in ANOVA
                                In ANOVA, the degrees of freedom for SSTO is n – 1, which represents the
                                sample size minus one. In the formula for SSTO, $\sum (y_i - \bar{y})^2$, you see there are
                                n observed y-values minus one mean. In a very general way, that’s where the
                                n – 1 comes from.

                                Note that if you divide SSTO by n – 1, you get $s^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1}$, the variance in the
                                y-values. This calculation makes good sense because the variance measures
                                the total variability in the y-values.
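
                                The link between SSTO and the variance is easy to confirm numerically.
                                The sketch below uses made-up y-values and compares SSTO divided by
                                n – 1 against the sample variance returned by Python's standard
                                statistics module.

```python
from statistics import mean, variance

# Hypothetical y-values, invented for illustration only.
y = [2, 4, 5, 4, 5]

n = len(y)
y_bar = mean(y)

# SSTO: squared distances of the observed y-values from their mean.
ssto = sum((yi - y_bar) ** 2 for yi in y)

# Dividing by the degrees of freedom (n - 1) gives the sample variance.
s_squared = ssto / (n - 1)

print(f"SSTO / (n - 1) = {s_squared}")
print(f"statistics.variance(y) = {variance(y)}")
```

                                Both values agree because the sample variance is defined as SSTO
                                divided by n – 1.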












