freedom, respectively. (In the Internet example, the p-value listed in the last
column of the ANOVA table is 0.000, meaning it's smaller than 0.001; that's far
below any reasonable significance level, so you reject Ho and conclude the
regression model fits.) But remember, in regression you don't use an F-statistic
and an F-test. You use a t-statistic and a t-test. What gives? The next section
explains.
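If you want to see where a p-value like that comes from, here's a minimal
sketch in Python using scipy (an assumption on my part; the book itself works
from software output, not code). The F-statistic and sample size below are
made-up numbers, and the degrees of freedom 1 and n - 2 are the ones for
simple regression:

    from scipy import stats

    f_stat = 181.0   # hypothetical F-statistic from an ANOVA table
    n = 25           # hypothetical sample size
    p_value = stats.f.sf(f_stat, 1, n - 2)  # area to the right of F
    print(round(p_value, 3))  # a value this small prints as 0.0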
Relating the F- and t-statistics: The final frontier
In regression, one way of testing whether the best-fitting line is statistically
significant is to test Ho: slope = 0 versus Ha: slope ≠ 0. To do this, you use a
t-test (see Chapter 3). The slope is the heart and soul of the regression line,
because it describes the main part of the relationship between x and y. If the
slope of the line equals zero (you can't reject Ho), you're just left with y = b0,
a horizontal line, and your model y = b0 + b1x isn't doing anything for you.
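Here's a minimal sketch of that t-test in Python with scipy (again an
assumption; the data are made up for illustration). linregress reports the
slope, its standard error, and the two-sided p-value for Ho: slope = 0
directly:

    import numpy as np
    from scipy import stats

    x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.3, 8.8])

    result = stats.linregress(x, y)
    t_stat = result.slope / result.stderr   # t = b1 / SE(b1), df = n - 2
    print(t_stat, result.pvalue)            # reject Ho if the p-value is small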
In ANOVA, you test to see whether the model fits by testing Ho: The means of
the populations are all equal, versus Ha: At least two of the population means
aren't equal. To do this, you use an F-test (taking MST and dividing it by MSE;
see Chapter 10).
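For comparison, here's the same kind of minimal Python sketch for the ANOVA
F-test, with made-up samples. scipy's f_oneway computes the same MST / MSE
ratio described in Chapter 10:

    from scipy import stats

    group1 = [4.2, 5.1, 4.8, 5.5]
    group2 = [6.0, 6.4, 5.9, 6.8]
    group3 = [4.9, 5.3, 5.0, 5.6]

    f_stat, p_value = stats.f_oneway(group1, group2, group3)
    print(f_stat, p_value)  # large F and small p: at least two means differ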
The sets of hypotheses in regression and ANOVA seem totally different, but in
essence, they're both doing the same general thing: testing whether a certain
model fits. In the regression case, the model you want to see fit is the straight
line; in the ANOVA case, the model of interest is a set of (normally distributed)
populations with at least two different means (and the same variance). In
ANOVA, each of these populations is labeled a treatment.
                                                    But more than that, you can think of it this way: Suppose you took all the
                                                    populations from the ANOVA and lined them up side by side on an x-y plane
                                                    (see Figure 12-2). If the means of those distributions are all connected by a
                                                    flat line (representing the mean of the y’s), then you would have no evidence
                                                    against Ho in the F-test, so you can’t reject it — your model isn’t doing any-
                                                    thing for you (it doesn’t fit). This idea is similar to the idea of fitting a flat hor-
                                                    izontal line through the y-values in regression; a straight-line model with a
                                                    nonzero slope doesn’t work in that case.
The big thing is that statisticians can prove (so you don't have to) that an
F-statistic is equivalent to the square of a t-statistic, and the F-distribution
is equivalent to the square of a t-distribution, whenever the numerator degrees
of freedom equals one. And in simple linear regression, SSR has df = 2 - 1 = 1
(the model's two parameters minus one), so the numerator degrees of freedom
is exactly one! (Note that F is always greater than or equal to zero, which is
needed if you're making it the square of something.) So there you have it! The
t-statistic for testing the regression model is equivalent to an F-statistic for
ANOVA when the ANOVA table is formed for the simple regression model.
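To see that equivalence in action, here's a minimal sketch (again Python with
numpy and scipy, an assumption, reusing the made-up data from the earlier slope
test) that computes both statistics and confirms that t squared equals F:

    import numpy as np
    from scipy import stats

    x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.3, 8.8])

    res = stats.linregress(x, y)
    t_stat = res.slope / res.stderr            # slope t-statistic

    y_hat = res.intercept + res.slope * x
    ssr = np.sum((y_hat - y.mean()) ** 2)      # regression sum of squares, df = 1
    sse = np.sum((y - y_hat) ** 2)             # error sum of squares, df = n - 2
    f_stat = (ssr / 1) / (sse / (len(x) - 2))  # F = MSR / MSE

    print(t_stat ** 2, f_stat)                 # the two numbers agree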