Page 126 - Intermediate Statistics for Dummies
Chapter 5: When Two Variables Are Better than One: Multiple Regression
As an alternative to checking normality with the regular residuals, you can
look at the standardized residuals plots (Figure 5-5) and check out the
upper-right plot. It shows how the standardized residuals are distributed
across the various estimated (fitted) values of y. If the model's conditions
hold, standardized residuals should roughly follow a standard normal
distribution. That is, they should have mean zero and standard deviation one.
So when you look at the standardized residuals, they should be centered around
zero with no predictable pattern, and with the same amount of variability
around the horizontal line that crosses at zero as you move from left to right.
Looking at the upper-right plot of Figure 5-5, you should also find that about
95 percent of the standardized residuals fall within two standard deviations of
the mean, which in this case is –2 to +2 (via the 68-95-99.7 Rule; remember
that from intro stats?). You should see more residuals hovering around zero
(where the middle lump would be on a standard normal distribution), and
you should have fewer and fewer of the residuals as you go away from zero.
The upper-right plot in Figure 5-5 confirms a normal distribution for the ads
and sales example on all the counts I just mentioned.
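If you'd rather count than eyeball, the "about 95 percent within –2 to +2" check can be sketched in a few lines of Python. Treat this as a rough illustration only: the function names and the residual values are made up for the example (they are not the ads and sales data), and the standardizing step is a simplification that divides every residual by the same standard deviation, whereas statistical software typically also adjusts each residual for its leverage.

```python
import statistics

def standardize(residuals):
    """Scale residuals to mean zero and standard deviation one.

    Simplification (an assumption of this sketch): every residual is divided
    by the same sample standard deviation. Statistical software usually also
    adjusts each residual for its leverage, which is ignored here.
    """
    mean = statistics.fmean(residuals)
    sd = statistics.stdev(residuals)
    return [(r - mean) / sd for r in residuals]

def share_within(std_residuals, limit=2.0):
    """Fraction of standardized residuals between -limit and +limit."""
    inside = sum(1 for z in std_residuals if abs(z) <= limit)
    return inside / len(std_residuals)

# Hypothetical residuals, made up for illustration only.
residuals = [-1.8, -1.1, -0.9, -0.4, -0.2, 0.0, 0.1, 0.3, 0.6, 1.0, 2.4]
z = standardize(residuals)
print(f"share within +/-2: {share_within(z):.2f}")
```

With a small data set, one or two points outside –2 to +2 can pull the share noticeably below 95 percent, so read the number alongside the plot rather than instead of it.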
The lower-left plots of Figures 5-4 and 5-5 show histograms of the regular
and standardized residuals, respectively. These histograms should reflect a
normal distribution; that is, the shape of the histograms should be
approximately symmetric and look like a bell-shaped curve. Note that if the
data set is small (as is the case here with only 22 observations), the
histogram may not be as close to normal as you would like; in that case,
consider it just one part of the body of evidence that all four residual plots
show you. The histograms shown in the lower-left plots of Figures 5-4 and 5-5
aren’t terribly normal looking; however, because you can’t see any glaring
problems with the upper-right plots, don’t be worried.
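To get a quick feel for the shape of a residual histogram without any plotting software, you could print a text version. The sketch below is just one way to do it; the function name, the bin width of one, and the sample values are all assumptions made up for the illustration, not anything taken from Figures 5-4 or 5-5.

```python
from collections import Counter
import math

def text_histogram(values, bin_width=1.0):
    """Return one text line per bin, with one star per value in that bin."""
    # Assign each value to the bin whose left edge is floor(value / width) * width.
    counts = Counter(math.floor(v / bin_width) for v in values)
    lines = []
    for b in range(min(counts), max(counts) + 1):
        left = b * bin_width
        stars = "*" * counts[b]  # bins with no values get an empty bar
        lines.append(f"{left:5.1f} to {left + bin_width:5.1f} | {stars}")
    return lines

# Hypothetical standardized residuals, made up for illustration only.
sample = [-1.5, -0.5, -0.2, 0.1, 0.3, 0.4, 1.2]
for line in text_histogram(sample):
    print(line)
```

A roughly symmetric pile of stars with the tallest bar near zero is what you hope to see; with only a couple dozen values, expect the shape to be lumpy.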
Satisfying the second condition
To look at the variance issue (condition two from a previous section), you can
look again at the upper-right plot of Figure 5-4 (or Figure 5-5). You shouldn’t
see any change in the amount of spread (variability) in the residuals around
that horizontal line as you move from left to right. Looking at Figure 5-4, the
upper-right graph, you can see no reason to say that condition number two
(the residuals have the same variance for each combination of the x
variables) hasn’t been met.
One particular problem that raises a red flag is if the residuals fan out, or
increase in spread, as you move from left to right on the upper-right plot.
This fanning out means that the variability increases more and more for
higher and higher predicted values of y, so the condition of equal variability
around the fitted line isn’t met, and the regression model would not fit well in
that case.
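One rough way to put a number on fanning out is to compare how spread out the residuals are for the lower half of the fitted values versus the upper half. The sketch below does just that. It's a back-of-the-envelope heuristic rather than a formal test, and the function name and both data sets are invented for the illustration.

```python
import statistics

def spread_ratio(fitted, residuals):
    """Residual spread for the upper half of fitted values divided by the lower half.

    A value near 1 is consistent with equal variability; a value well above 1
    suggests the residuals fan out as the fitted values increase.
    Rough heuristic only (an assumption of this sketch), not a formal test.
    Needs at least four points; with an odd count the middle point is left out.
    """
    pairs = sorted(zip(fitted, residuals))  # order the residuals by fitted value
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[-half:]]
    return statistics.stdev(high) / statistics.stdev(low)

# Hypothetical data, made up for illustration only.
fitted = [1, 2, 3, 4, 5, 6, 7, 8]
steady = [0.5, -0.5, 0.4, -0.4, 0.5, -0.5, 0.4, -0.4]   # same spread throughout
fanning = [0.1, -0.1, 0.2, -0.2, 0.8, -0.8, 1.6, -1.6]  # spread grows left to right

print(f"steady ratio:  {spread_ratio(fitted, steady):.1f}")
print(f"fanning ratio: {spread_ratio(fitted, fanning):.1f}")
```

How far above 1 the ratio has to be before you worry is a judgment call, especially with a small data set, so use it to back up what you see in the upper-right plot, not to replace it.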