Page 141 - Intermediate Statistics for Dummies
Part II: Making Predictions by Using Regression
If the x variable is statistically significant (its p-value is less than the pre-
selected α level), it makes a significant contribution to determining y,
given that the rest of the variables in the model are fixed. In that case,
that x variable remains a possible candidate for inclusion in the model
at this point. If the x variable isn’t statistically significant, then it is con-
sidered for removal at this particular point.
4. Find the variable with the largest p-value on the Minitab output.
This variable is the one that has the least contribution toward y given
the rest of the variables in the model.
5. If the p-value for the variable found in step four is larger than the
removal level, then remove the variable from the model.
6. Repeat steps three through five on the new model, removing one variable
at a time. Once the largest p-value from step four falls below the
removal level, stop the backward selection process and don’t remove
that variable or any others.
You now have your final model, which will include some subset of x vari-
ables from the full model in step two.
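If you'd like to see the steps above outside of Minitab, here is a minimal Python sketch of the same idea. It is an illustration under stated assumptions, not Minitab's implementation: the helper names (invert, t_two_sided_p, fit_ols, backward_select), the made-up data, and the 0.10 removal level are all hypothetical, and the p-values come from ordinary t-tests on each coefficient.

```python
import math
import random

def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def t_two_sided_p(t, df):
    """Two-sided p-value for a t statistic, using Simpson's rule on the t density."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    n = 2 * max(500, int(100 * upper))  # even number of intervals
    h = upper / n
    area = pdf(0.0) + pdf(upper)
    for i in range(1, n):
        area += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * area * h / 3.0)

def fit_ols(columns, y):
    """Least-squares fit of y on an intercept plus columns; returns (coefs, p-values)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)] for i in range(k)]
    xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y[r] - sum(b * x for b, x in zip(beta, X[r])) for r in range(n)]
    df = n - k
    mse = sum(e * e for e in resid) / df
    pvals = [t_two_sided_p(beta[i] / math.sqrt(mse * inv[i][i]), df) for i in range(k)]
    return beta, pvals

def backward_select(data, y, alpha_remove=0.10):
    """data maps variable name -> list of values; returns the names that survive."""
    kept = list(data)  # start from the full model
    while kept:
        _, pvals = fit_ols([data[name] for name in kept], y)
        # pvals[0] belongs to the intercept, which is never a removal candidate
        worst_p, worst = max(zip(pvals[1:], kept))
        if worst_p <= alpha_remove:
            break  # largest p-value is at or below the removal level: stop
        kept.remove(worst)  # drop one variable, then refit
    return kept

# Made-up data: y depends on x1 and x2; x3 is unrelated noise.
random.seed(1)
n_obs = 60
data = {name: [random.gauss(0, 1) for _ in range(n_obs)] for name in ("x1", "x2", "x3")}
y = [3 + 2 * a - 1.5 * b + random.gauss(0, 1) for a, b in zip(data["x1"], data["x2"])]
kept = backward_select(data, y, alpha_remove=0.10)
print(kept)
```

The loop refits the model after every removal because each p-value is conditional on the other variables still in the model, so dropping one variable can change the p-values of the rest.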
To find a best multiple linear regression model by using the backward selec-
tion procedure in Minitab, go to Stat>Regression>Stepwise. Highlight the
variable that is the response (y) variable, and click Select. Then highlight
the variables that are the predictor (x) variables, and click Select. Click on
Methods, and choose Backward Selection. Choose the α to remove (the removal
level that you select for dropping a variable). The F-value for removal has a
default of 4.0, which should be fine. Click OK, and you get the output for the
backward selection procedure similar to Figure 6-4.
Assessing model fit
The measures of model fit at each stage of the backward selection procedure
are the same as those for the forward selection procedure in the previous
section. The computer output shows you the value of R², the value of R²
adjusted, and Mallows's C-p. (See the earlier section “How well does the model
fit?” for more information on each of these measures.)
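As a quick refresher on how those three measures are computed, here is a small Python sketch using the standard textbook formulas. The function names and the numbers are hypothetical, chosen only for illustration. Here sse is the error sum of squares for the model being assessed, ssto is the total sum of squares, n is the number of observations, p is the number of coefficients in the model (counting the intercept), and mse_full is the mean squared error of the full model.

```python
def r_squared(sse, ssto):
    """Proportion of the variability in y explained by the model."""
    return 1.0 - sse / ssto

def adjusted_r_squared(sse, ssto, n, p):
    """R-squared penalized for the number of coefficients p (intercept included)."""
    return 1.0 - (n - 1) / (n - p) * (sse / ssto)

def mallows_cp(sse, mse_full, n, p):
    """Mallows's C-p; values close to p suggest the model fits without much bias."""
    return sse / mse_full - (n - 2 * p)

# Hypothetical numbers: 25 observations, a full model with 6 coefficients,
# and a reduced model with 4 coefficients left after backward selection.
n = 25
ssto = 100.0
sse_full, p_full = 18.0, 6
sse_reduced, p_reduced = 20.0, 4
mse_full = sse_full / (n - p_full)

print(r_squared(sse_reduced, ssto))
print(adjusted_r_squared(sse_reduced, ssto, n, p_reduced))
print(mallows_cp(sse_reduced, mse_full, n, p_reduced))
```

By construction, Mallows's C-p for the full model always equals its own p, so the measure is only informative when you compare reduced models against the full one.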
Kicking variables out to estimate punt distance
This section applies the backward selection procedure to the punt distance
data so you can see how the process works and how to interpret the results