Page 122 - Intermediate Statistics for Dummies

                                                Chapter 5: When Two Variables Are Better than One: Multiple Regression
                                         Predicting Y by Using the X Variables
                                                    By now, you should have your multiple regression model. You’re finally ready
                                                    to complete step six of the multiple regression analysis: to predict the value
                                                    of y given a set of values for the x variables. To make this prediction, you take
                                                    those x values for which you want to predict y, plug them into the multiple
                                                    regression model, and simplify.
                                                    In the ads and plasma TV sales example (see analysis from Figure 5-3), the
                                                    best-fitting model is y = 5.26 + 0.162x₁ + 0.249x₂. In the context of the problem,
                                                    the model is Sales = 5.26 + 0.162 × TV ad spending (x₁) + 0.249 × newspaper ad
                                                    spending (x₂).
                                                    Remember that the units for plasma TV sales are millions of dollars and the
                                                    units for ad spending for both TV and newspaper ads are thousands of
                                                    dollars. That is, $20,000 spent on TV ads means x₁ = 20 in the model. Similarly,
                                                    $10,000 spent on newspaper ads means x₂ = 10 in the model. Forgetting the
                                                    units can lead to serious miscalculations.
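
As a minimal sketch of the unit conversion described above, the snippet below turns raw dollar amounts into the thousands-of-dollars units the model expects. The helper name `dollars_to_thousands` is a hypothetical name chosen for illustration; it does not come from the book.

```python
def dollars_to_thousands(dollars):
    """Convert a raw dollar amount to thousands of dollars (the model's ad-spending units)."""
    return dollars / 1000


# Ad budgets from the example, in raw dollars.
x1 = dollars_to_thousands(20_000)  # TV ad spending
x2 = dollars_to_thousands(10_000)  # newspaper ad spending

print(x1, x2)
```

Plugging 20000 into the model instead of 20 would inflate the TV ad term by a factor of 1,000, which is the kind of miscalculation the conversion guards against.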
                                                    Suppose you want to estimate plasma TV sales if you spend $20,000 on TV
                                                    ads and $10,000 on newspaper ads. Plug x₁ = 20 and x₂ = 10 into the multiple
                                                    regression model, and you get y = 5.26 + 0.162(20) + 0.249(10) = 10.99. In other
                                                    words, if you spend $20,000 on TV advertising and $10,000 on newspaper
                                                    advertising, you estimate that sales will be $10.99 million.
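
The plug-in calculation can be sketched in a few lines of Python. The coefficients are the ones quoted for the fitted model; the function and constant names (`predict_sales`, `INTERCEPT`, `B_TV`, `B_NEWSPAPER`) are hypothetical names introduced for this illustration.

```python
# Coefficients of the fitted model quoted in the text.
INTERCEPT = 5.26      # estimated sales (millions of dollars) with no ad spending
B_TV = 0.162          # change in sales per $1,000 of TV ad spending
B_NEWSPAPER = 0.249   # change in sales per $1,000 of newspaper ad spending


def predict_sales(tv_thousands, newspaper_thousands):
    """Return estimated sales in millions of dollars.

    Both inputs must be in thousands of dollars, matching the model's units.
    """
    return INTERCEPT + B_TV * tv_thousands + B_NEWSPAPER * newspaper_thousands


estimate = predict_sales(20, 10)
print(f"Estimated sales: ${estimate:.2f} million")
```

The intermediate terms are 0.162(20) = 3.24 and 0.249(10) = 2.49, which together with the intercept of 5.26 give the 10.99 quoted above.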
                                                    This estimate at least makes some sense in terms of the data shown in
                                                    Table 5-1. Location ten spent $20,000 on TV ads and $5,000 on newspaper
                                                    ads (less than your newspaper amount) and had sales of $9.82 million. Location
                                                    eleven spent a little more on TV ads and a lot more on newspaper ads than
                                                    you plan to, and had sales of $16.28 million. Your spending amounts fall
                                                    between the amounts of locations ten and eleven, and your estimated sales
                                                    fall between theirs as well.
                                                    Be careful to plug in only values for the x variables that fall within the
                                                    range of the data. For example, Table 5-1 shows data for TV ad spending
                                                    between $0 and $50,000 and newspaper ad spending between $0 and $25,000. It
                                                    would not be appropriate, say, to try to estimate sales for spending amounts
                                                    of $75,000 for TV ads and $50,000 for newspaper ads. The reason
                                                    is that the regression model you came up with fits only the data that you
                                                    collected; you have no way of knowing whether that same relationship
                                                    continues outside that range. This no-no of estimating y for values of the x variables
                                                    outside their range is called extrapolation. As one of my colleagues says,
                                                    “Friends don’t let friends extrapolate.”
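
One way to guard against extrapolation is to check each x value against the observed range before predicting. The sketch below assumes the ranges stated in the text for Table 5-1 ($0 to $50,000 for TV ads, $0 to $25,000 for newspaper ads), expressed in thousands of dollars; the names `X_RANGES` and `is_extrapolation` are hypothetical.

```python
# Observed ranges of the x variables, in thousands of dollars,
# as described in the text for Table 5-1 (assumed here, not read from the table).
X_RANGES = {
    "tv": (0, 50),
    "newspaper": (0, 25),
}


def is_extrapolation(tv_thousands, newspaper_thousands):
    """Return True if either x value lies outside its observed range."""
    values = {"tv": tv_thousands, "newspaper": newspaper_thousands}
    return any(
        not (low <= values[name] <= high)
        for name, (low, high) in X_RANGES.items()
    )


print(is_extrapolation(20, 10))  # spending amounts from the worked example
print(is_extrapolation(75, 50))  # amounts the text warns against
```

This checks each variable on its own. A combination of values can still be unlike anything in the data even when each value is individually in range, so passing this check is necessary but not a guarantee that the prediction is safe.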