Page 126 - Intermediate Statistics for Dummies

                                                Chapter 5: When Two Variables Are Better than One: Multiple Regression
                                                    As an alternative check for normality apart from using the regular residuals,
                                                    you can look at the standardized residuals plot (Figure 5-5) and check out the
                                                    upper-right plot. It shows how the residuals are distributed across the vari-
                                                    ous estimated (fitted) values of y. Standardized residuals are supposed to
                                                    follow a standard normal distribution. That is, they should have mean zero
                                                    and standard deviation one. So when you look at the standardized residuals,
                                                    they should be centered around zero in a way that has no predictable pat-
                                                    tern, with the same amount of variability around the horizontal line that
                                                    crosses at zero as you move from left to right.
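If you'd rather compute standardized residuals yourself than read them off software output, here is a minimal Python sketch. It assumes NumPy and uses made-up numbers, not the ads and sales data from the example. It also assumes one common definition of a standardized residual (each residual divided by its estimated standard deviation), which your software may define slightly differently. The names `standardized_residuals`, `tv_ads`, `news_ads`, and `sales` are hypothetical.

```python
import numpy as np

def standardized_residuals(X, y):
    """Fit least squares with an intercept; return (fitted values, standardized residuals)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(len(y)), X])      # design matrix with intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    fitted = A @ beta
    resid = y - fitted                             # regular residuals
    n, p = A.shape
    s = np.sqrt(resid @ resid / (n - p))           # residual standard error
    Q, _ = np.linalg.qr(A)                         # leverage = squared row norms of Q
    leverage = np.sum(Q ** 2, axis=1)
    return fitted, resid / (s * np.sqrt(1.0 - leverage))

# Made-up data for illustration only: 22 observations, two x variables.
rng = np.random.default_rng(0)
tv_ads = rng.uniform(0, 5, size=22)
news_ads = rng.uniform(0, 5, size=22)
sales = 6.0 + 0.2 * tv_ads + 0.1 * news_ads + rng.normal(0, 0.5, size=22)

fitted, std_resid = standardized_residuals(np.column_stack([tv_ads, news_ads]), sales)

# These should land near 0 and 1 if the standardized residuals behave as expected.
print("mean:", round(float(std_resid.mean()), 3))
print("standard deviation:", round(float(std_resid.std(ddof=1)), 3))
```

Plotting `std_resid` against `fitted` gives the same kind of picture as the upper-right plot: a band centered on zero with no pattern from left to right.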
                                                    When you look at the upper-right plot of Figure 5-5, you should also find that
                                                    most (about 95 percent) of the standardized residuals fall within two standard
                                                    deviations of the mean, which in this case is –2 to +2 (via the 68-95-99.7 Rule;
                                                    remember that from intro stats?). You should see more residuals hovering around zero
                                                    (where the middle lump would be on a standard normal distribution), and
                                                    you should have fewer and fewer of the residuals as you go away from zero.
                                                    The upper-right plot in Figure 5-5 confirms a normal distribution for the ads
                                                    and sales example on all the counts I just mentioned.
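You can back up the eyeball check with a quick count. The sketch below assumes NumPy and a hypothetical helper named `share_within`. The 22 standardized residuals are made up for illustration and are not the values behind Figure 5-5.

```python
import numpy as np

def share_within(std_resid, k=2.0):
    """Proportion of standardized residuals whose absolute value is at most k."""
    std_resid = np.asarray(std_resid, dtype=float)
    return float(np.mean(np.abs(std_resid) <= k))

# Made-up standardized residuals (22 values), for illustration only.
example = [-2.3, -1.4, -1.1, -0.9, -0.7, -0.5, -0.4, -0.3, -0.2, -0.1, 0.0,
           0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 0.9, 1.1, 1.3, 1.6, 1.9]

within_one = share_within(example, k=1.0)   # 15 of the 22 values
within_two = share_within(example, k=2.0)   # 21 of the 22 values

print(f"within 1: {within_one:.2f}, within 2: {within_two:.2f}")
```

With only 22 residuals, don't expect to hit 68 and 95 percent exactly; you're looking for shares in that neighborhood, not a perfect match.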
                                                    The lower-left plots of Figures 5-4 and 5-5 show histograms of the regular
                                                    and standardized residuals, respectively. These histograms should reflect a
                                                    normal distribution; that is, the shape of the histograms should be approxi-
                                                    mately symmetric and look like a bell-shaped curve. Note that if the data set
                                                    is small (as is the case here with only 22 observations), the histogram may
                                                    not be as close to normal as you would like; in that case, consider it part of
                                                    the body of evidence that all four residual plots show you. The histograms
                                                    shown in the lower-left plots of Figures 5-4 and 5-5 aren’t terribly normal
                                                    looking; however, because you can’t see any glaring problems with the
                                                    upper-right plots, don’t be worried.
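For a rough look at the histogram shape without any plotting software, you can bin the standardized residuals and print the counts as bars. This is only a sketch: it assumes NumPy, the residual values are made up for illustration, and the choice of six bins from –3 to +3 is an assumption rather than what the figures use.

```python
import numpy as np

# Made-up standardized residuals (22 values), for illustration only.
std_resid = np.array([-2.3, -1.4, -1.1, -0.9, -0.7, -0.5, -0.4, -0.3, -0.2, -0.1, 0.0,
                      0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 0.9, 1.1, 1.3, 1.6, 1.9])

# Six equal-width bins covering -3 to +3 (an assumed choice of bins).
counts, edges = np.histogram(std_resid, bins=6, range=(-3, 3))

# Print one row of '#' marks per bin as a text histogram.
for count, left, right in zip(counts, edges[:-1], edges[1:]):
    print(f"{left:+.0f} to {right:+.0f}: {'#' * int(count)}")
```

A roughly bell-shaped set of bars (tall in the middle, short at the ends) is what you're hoping for. With a small data set, a lopsided bar or two isn't a reason to panic.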
                                                    Satisfying the second condition
                                                    To look at the variance issue (condition two from a previous section), you can
                                                    look again at the upper-right plot of Figure 5-4 (or Figure 5-5). You shouldn’t
                                                    see any change in the amount of spread (variability) in the residuals around
                                                    that horizontal line as you move from left to right. Looking at Figure 5-4, the
                                                    upper-right graph, you can see no reason to say that condition number two
                                                    (the residuals have the same variance for each combination of the x vari-
                                                    ables) hasn’t been met.
                                                    One particular problem that raises a red flag is if the residuals fan out, or
                                                    increase in spread, as you move from left to right on the upper-right plot.
                                                    This fanning out means that the variability increases more and more for
                                                    higher and higher predicted values of y, so the condition of equal variability
                                                    around the fitted line isn’t met, and the regression model would not fit well in
                                                    that case.
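To put a rough number on fanning out, you can compare the spread of the residuals for the lower half of the fitted values with the spread for the upper half. This is an informal companion to the plot, not a formal test. The sketch assumes NumPy, and both the helper name `spread_ratio` and the data are made up for illustration.

```python
import numpy as np

def spread_ratio(fitted, resid):
    """Standard deviation of residuals in the upper half of fitted values,
    divided by the standard deviation in the lower half."""
    fitted = np.asarray(fitted, dtype=float)
    resid = np.asarray(resid, dtype=float)
    ordered = resid[np.argsort(fitted)]   # residuals sorted by fitted value
    half = len(ordered) // 2
    low, high = ordered[:half], ordered[-half:]
    return float(np.std(high, ddof=1) / np.std(low, ddof=1))

# Made-up fitted values and residuals, for illustration only.
fitted = np.arange(1, 21, dtype=float)
signs = np.where(np.arange(20) % 2 == 0, 1.0, -1.0)

even_resid = signs                  # same spread everywhere
fanning_resid = signs * 0.1 * fitted  # spread grows with the fitted value

even_ratio = spread_ratio(fitted, even_resid)
fan_ratio = spread_ratio(fitted, fanning_resid)

print("even spread ratio:", round(even_ratio, 2))
print("fanning ratio:", round(fan_ratio, 2))
```

A ratio near 1 goes along with a plot that shows the same spread from left to right; a ratio well above 1 is the numeric version of the residuals fanning out.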