Page 128 - Intermediate Statistics for Dummies
Chapter 6
One Step Forward and Two Steps Back: Regression Model Selection
In This Chapter
Evaluating different methods for choosing a multiple regression model
Understanding how forward selection and backward selection work
Using the best subsets method to find a good model
Suppose you’re trying to estimate some quantitative variable, y, and you
have many x variables available at your disposal. You have so many
variables related to y, in fact, that you feel like I do in my job every day —
overwhelmed with opportunity. Where do you go? What do you do? Never
fear, this chapter is for you.
In this chapter, you see three different procedures statisticians use to find a
best possible model: forward selection, backward selection, and best subsets
selection. Each procedure can lead you to a different final model, and you
can’t find one single procedure that everyone agrees is the one to use. Each
selection method has positives and negatives associated with it, as you can see
in this chapter. No matter what method you choose, each method has the same
goal: to get the best possible model for y by using a set of x variables. Yet the
road that each procedure takes to get there is a bit different, so read on!
Note that the term best has many connotations here. You can’t find one
end-all-be-all model that everyone comes up with in the end. In other words,
different data analysts can come up with different models, and each of those
models can still do a good job of predicting y.
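
To make the idea of forward and backward selection a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the procedure used by any particular statistics package. Variables are added or dropped one at a time according to adjusted R-squared, which is a criterion chosen here for simplicity; real software often uses p-value cutoffs or other measures instead. The function names and the simulated data are hypothetical.

```python
import numpy as np

def adjusted_r2(X, y, cols):
    """Adjusted R^2 of a least-squares fit of y on an intercept plus columns `cols`."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sse = float(resid @ resid)                  # error sum of squares
    sst = float(((y - y.mean()) ** 2).sum())    # total sum of squares
    p = len(cols)                               # number of x variables in the model
    return 1.0 - (sse / (n - p - 1)) / (sst / (n - 1))

def forward_selection(X, y):
    """Start with no x variables; keep adding the one that helps most (assumed rule)."""
    chosen, remaining = [], list(range(X.shape[1]))
    best = adjusted_r2(X, y, chosen)
    while remaining:
        score, j = max((adjusted_r2(X, y, chosen + [j]), j) for j in remaining)
        if score <= best:       # no candidate improves the model, so stop
            break
        chosen.append(j)
        remaining.remove(j)
        best = score
    return sorted(chosen)

def backward_selection(X, y):
    """Start with every x variable; keep dropping the one whose removal helps most."""
    chosen = list(range(X.shape[1]))
    best = adjusted_r2(X, y, chosen)
    while chosen:
        score, j = max(
            (adjusted_r2(X, y, [k for k in chosen if k != j]), j) for j in chosen
        )
        if score <= best:       # every removal hurts (or does nothing), so stop
            break
        chosen.remove(j)
        best = score
    return chosen

# Simulated data (made up for illustration): y truly depends on x0 and x2 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(size=200)

fwd = forward_selection(X, y)
bwd = backward_selection(X, y)
print("forward selection kept columns: ", fwd)
print("backward selection kept columns:", bwd)
```

Both procedures should hold on to the two x variables that truly drive y in this simulated data set, but they may disagree about the leftover noise variables, which is the sense in which different procedures can reach different final models.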