Page 129 - Statistics II for Dummies
Chapter 6: How Can I Miss You If You Won’t Leave? Regression Model Selection
The secret to a punter’s success: An example
Returning to the punt distance example from earlier in this chapter, suppose
that you analyzed the punt distance data by using the best subsets model
selection procedure. Your results are shown in Figure 6-2. This section fol-
lows Minitab’s footsteps in getting these results and provides you with a
guide for interpreting the results.
Assuming that you already used Minitab to carry out the best subsets selec-
tion procedure on the punt distance data, you can now analyze the output
from Figure 6-2. Each variable shows up as a column on the right side of the
output. Each row represents the results from a model containing the number
of variables shown in column one. The X’s at the end of each row tell you
which variables were included in that model. The number of variables in the
model starts at 1 and increases to 6 because six x variables are available in
the data set.
The models with the same number of variables are ordered by their values
of R² adjusted and Mallows’s C-p, from best to worst. The top two models (for
each number of variables) are included in the computer output.
For example, rows one and two of Figure 6-2 (both marked 1 in the Vars
column) show the top two models containing one x variable; rows three and
four show the top two models containing two x variables; and so on. Finally,
the last row shows the results of the full model containing all six variables.
(Only one model contains all six variables, so you don’t have a second-best
model in this case.)
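
The layout described above can be reproduced with a short sketch. This is not Minitab's implementation and it does not use the punt distance data; it runs on made-up random data, and the names sse_of_fit, best_subsets, and x1 through x6 are hypothetical. It fits every possible subset of six x variables, keeps the top two models of each size, and prints X's to mark which variables each model includes.

```python
from itertools import combinations

import numpy as np


def sse_of_fit(X, y):
    """Fit y on the columns of X (plus an intercept) and return the SSE."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def best_subsets(X, y, keep=2):
    """Return the `keep` best models of each size as
    (size, R-sq, R-sq adjusted, C-p, included column indices)."""
    n, k = X.shape
    sst = float(((y - y.mean()) ** 2).sum())
    mse_full = sse_of_fit(X, y) / (n - k - 1)  # MSE of the full model
    table = []
    for size in range(1, k + 1):
        fits = []
        for cols in combinations(range(k), size):
            sse = sse_of_fit(X[:, list(cols)], y)
            r2 = 1 - sse / sst
            r2_adj = 1 - (1 - r2) * (n - 1) / (n - size - 1)
            cp = sse / mse_full - n + 2 * (size + 1)
            fits.append((size, r2, r2_adj, cp, cols))
        # Within one size, sorting by R-sq also sorts by R-sq adjusted and C-p.
        fits.sort(key=lambda f: f[1], reverse=True)
        table.extend(fits[:keep])
    return table


# Made-up data (an assumption for illustration): 30 cases, six x variables.
rng = np.random.default_rng(0)
n, k = 30, 6
X = rng.normal(size=(n, k))
y = 3 * X[:, 0] + 2 * X[:, 3] + rng.normal(size=n)
names = [f"x{j + 1}" for j in range(k)]

rows = best_subsets(X, y)
print("Vars  R-Sq  R-Sq(adj)    C-p  " + " ".join(names))
for size, r2, r2_adj, cp, cols in rows:
    marks = "  ".join("X" if j in cols else " " for j in range(k))
    print(f"{size:4d} {100 * r2:5.1f} {100 * r2_adj:10.1f} {cp:6.1f}  {marks}")
```

With six x variables there are 63 possible models, but the table has only 11 rows: two for each size from 1 to 5, and one for the full model.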
Looking at the first two rows of Figure 6-2, the top one-variable model is the
one including hang time only. The second-best one-variable model includes
only right foot flexibility. The right foot flexibility model has a lower value of
R² and a higher Mallows’s C-p than the hang time model, which is why it’s the
second best.
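
For reference, the usual textbook definitions of the three criteria are written out here. The symbols are notation introduced for this sketch, not taken from Figure 6-2: n is the number of observations, p is the number of x variables in the model being judged, SSE_p is that model's error sum of squares, SST is the total sum of squares, and MSE_full is the mean squared error of the model containing all the x variables.

```latex
\begin{align*}
R^2 &= 1 - \frac{SSE_p}{SST} \\[4pt]
R^2_{\text{adj}} &= 1 - \frac{SSE_p / (n - p - 1)}{SST / (n - 1)} \\[4pt]
C_p &= \frac{SSE_p}{MSE_{\text{full}}} - n + 2(p + 1)
\end{align*}
```

When two models have the same number of variables, n and p are the same for both, so all three criteria depend only on SSE_p. A larger SSE_p gives a lower R², a lower R² adjusted, and a higher C-p at the same time, which is why the criteria always agree on the ranking within one model size.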
Row three shows that the best two-variable model for estimating punt dis-
tance is the model containing right leg strength and overall leg strength.
The best three-variable model, found in row five, includes right foot
strength, right foot flexibility, and overall leg strength. The best
four-variable model is found in row seven and includes right foot strength,
right and left foot flexibility, and overall leg strength.
The best five-variable model is found in row nine and includes every vari-
able except left foot strength. The only six-variable model with all variables
included is listed in the last row.

