Page 218 - Intermediate Statistics for Dummies

                                                              Chapter 12: Rock My World: Relating Regression to ANOVA
                                                    suspect some variable is out there (call it x) that has some connection to the
                                                    y variable, and that variable can help you make more sense out of this seem-
                                                    ingly wide range of y-values.
                                                    For example, if you record the calories for five types of candy bars as 100,
                                                    200, 300, 400, and 500, you would say “Wow, that’s a lot of variation in calo-
                                                    ries; I wonder why that is?” Then you notice that the weights of the candy
                                                    bars are 1, 2, 3, 4, and 5 ounces, respectively. This relationship can be
                                                    expressed as y = 100x, where y equals calories and x equals weight.
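As a quick check on the candy bar numbers, the short Python sketch below divides each calorie count by its weight and confirms that y = 100x reproduces all five values. The variable names (`weights`, `calories`, `ratios`, `predicted`) are made up for illustration; only the data values come from the example.

```python
# Candy bar data from the example (variable names are illustrative).
weights = [1, 2, 3, 4, 5]              # x: weight in ounces
calories = [100, 200, 300, 400, 500]   # y: calories

# Calories per ounce for each bar; a constant ratio means y = (ratio) * x.
ratios = [y / x for x, y in zip(weights, calories)]
print("Calories per ounce:", ratios)

# Check that y = 100x reproduces every recorded calorie count.
predicted = [100 * x for x in weights]
print("y = 100x matches the data:", predicted == calories)
```

Because every bar works out to the same calories per ounce, a straight line through the origin describes these five points exactly. Real data rarely lines up this neatly.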
                                                    Now you can look at what before was a bunch of variability in the y-values
                                                    and say, “Hey, that’s not just random variability; the differing y-values can
                                                    be explained by the weight of the candy bar (x).” You can now use x in a nice
                                                    regression model to estimate y. Notice that you’re talking about splitting the
                                                    total variability in the y’s into the part due to x and the part due to chance
                                                    (error). That’s ANOVA language! Hey, perhaps regression and ANOVA are
                                                    related after all . . .
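To see that split in action, here is a minimal Python sketch that fits a least-squares line to the candy bar data and breaks the total variability in the y's into a part explained by x and a part left to error. The names `total_ss`, `explained_ss`, and `error_ss` are labels chosen for this sketch, not notation taken from the text, and the formulas are the standard simple-regression ones.

```python
# Candy bar data from the example (variable names are illustrative).
weights = [1, 2, 3, 4, 5]              # x: weight in ounces
calories = [100, 200, 300, 400, 500]   # y: calories

n = len(weights)
x_bar = sum(weights) / n
y_bar = sum(calories) / n

# Least-squares slope and intercept for the line y = intercept + slope * x.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(weights, calories))
sxx = sum((x - x_bar) ** 2 for x in weights)
slope = sxy / sxx
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * x for x in weights]

# Split the total variability in y into two pieces.
total_ss = sum((y - y_bar) ** 2 for y in calories)                # all variability in y
explained_ss = sum((f - y_bar) ** 2 for f in fitted)              # part due to x
error_ss = sum((y - f) ** 2 for y, f in zip(calories, fitted))    # part due to chance

print("slope:", slope, "intercept:", intercept)
print("total:", total_ss, "explained by x:", explained_ss, "error:", error_ss)
```

For this tidy data set the error piece is zero, so weight accounts for all of the variability in calories. With messier data, both pieces would be positive and would still add up to the total.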
                                                    To continue with the Internet use example, suppose you have a brainstorm
                                                    that number of years of education could possibly be related to Internet use.
                                                    In this case, the explanatory variable (input variable, x) is years of education,
                                                    and you want to use it to try to estimate y, the number of hours on the
                                                    Internet in a month. You take a larger random sample of 250 Internet users
                                                    and ask them how many years of education they had (so n = 250). You can
                                                    check out the first ten observations from your data set containing the (x, y)
                                                    pairs in Table 12-1. If a significant connection of some sort exists between the
                                                    x-values and the y-values, then you can say that x is helping to explain some
                                                    of the variability in the y’s. If it explains enough variability, you can place x
                                                    into a simple regression model and use it to estimate y.
                                                      Table 12-1       First Ten Observations from the Education
                                                                               and Internet Use Example
                                                      Years of Education    Hours on Internet (For One Month)
                                                      15                    41
                                                      15                    32
                                                      11                    33
                                                      10                    42
                                                      10                    28
                                                      10                    21
                                                                                                               (continued)
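The sketch below shows, under heavy caveats, how you might measure the share of variability in y that x explains. It uses only the six rows of Table 12-1 visible on this page, not the full sample of 250, so whatever it prints is an illustration of the arithmetic and not a result about Internet use. The variable names, including `share_explained`, are hypothetical.

```python
# The six (x, y) pairs visible in Table 12-1 on this page.
# The full data set has n = 250; these rows are for illustration only.
education = [15, 15, 11, 10, 10, 10]   # x: years of education
hours = [41, 32, 33, 42, 28, 21]       # y: hours on the Internet in one month

n = len(education)
x_bar = sum(education) / n
y_bar = sum(hours) / n

# Least-squares line y = intercept + slope * x.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(education, hours))
sxx = sum((x - x_bar) ** 2 for x in education)
slope = sxy / sxx
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * x for x in education]

# Variability in y: total, the part due to x, and the part left over.
total_ss = sum((y - y_bar) ** 2 for y in hours)
explained_ss = sum((f - y_bar) ** 2 for f in fitted)
error_ss = sum((y - f) ** 2 for y, f in zip(hours, fitted))

# Fraction of the variability in y that x accounts for (between 0 and 1).
share_explained = explained_ss / total_ss

print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("share of variability explained by x:", round(share_explained, 3))
```

A share close to 1 would suggest x does a lot of the explaining, and a share close to 0 would suggest it does very little. Six observations are far too few to judge either way, and deciding whether a connection is significant calls for a formal test on the full sample.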