Page 110 - Intermediate Statistics for Dummies
Chapter 5: When Two Variables Are Better than One: Multiple Regression
Stepping through the analysis
Your job in conducting a multiple regression analysis is to do the following
(the computer can help you do steps three through six):
1. Come up with a list of possible x variables that may be helpful in
estimating y.
2. Collect data on the y variable and your x variables from step one.
3. Check the relationships between each x variable and y (using scatterplots
and correlations) and use the results to eliminate those x variables that
aren’t strongly related to y.
4. Look at possible relationships between the x variables themselves to
make sure you aren’t being redundant (in statistical terms, you’re
trying to avoid the problem of multicollinearity).
If two x variables relate to y the same way, you don’t need both in the
model.
5. Use those x variables (from step four) in a multiple regression analysis
to find the best-fitting model for your data.
6. Use the best-fitting model (step five) to predict y for given x-values by
plugging those x-values into the model.
I outline each of these steps in the sections to follow.
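As a rough preview of steps three through six, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for illustration: the data are made up (y is constructed to equal 2 + 3*x1 + 2*x2 exactly so the result is easy to check), the variable names x1 through x4 are hypothetical, and the cutoffs 0.3 ("not strongly related") and 0.9 ("redundant") are arbitrary choices, not rules from this chapter. In practice you would look at scatterplots too and let statistical software do the fitting.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    ss_a = sum((u - mean_a) ** 2 for u in a)
    ss_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(ss_a * ss_b)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    cols = [[1.0] * len(y)] + columns
    A = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(A, rhs)

# Made-up data: x3 is x1 in different units, x4 is unrelated to y.
data = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [5, 3, 8, 1, 9, 2, 7, 4],
    "x3": [10, 20, 30, 40, 50, 60, 70, 80],
    "x4": [3, 1, 2, 4, 4, 2, 1, 3],
}
y = [15, 14, 27, 16, 35, 24, 37, 34]

WEAK, REDUNDANT = 0.3, 0.9  # arbitrary illustrative cutoffs (assumptions)

# Step 3: keep only x variables that are reasonably correlated with y.
related = [name for name, col in data.items() if abs(pearson(col, y)) >= WEAK]

# Step 4: drop an x variable if it nearly duplicates one already kept.
kept = []
for name in related:
    if all(abs(pearson(data[name], data[k])) < REDUNDANT for k in kept):
        kept.append(name)

# Step 5: fit the model with the remaining x variables.
coefs = fit([data[name] for name in kept], y)

# Step 6: predict y by plugging x-values into the fitted model.
prediction = coefs[0] + sum(c * v for c, v in zip(coefs[1:], [5, 4]))
print(kept, [round(c, 3) for c in coefs], round(prediction, 3))
```

With this data, x4 drops out in step three because it has almost no correlation with y, and x3 drops out in step four because it carries the same information as x1.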
Looking at X’s and Y’s
The first step of a multiple regression analysis comes way before the number
crunching on the computer; it occurs even before the data is collected. Step
one is where you sit down and think about what variables may be useful in
predicting your response variable y. This step will likely take more time than
any other step, except maybe the data-collection process. Deciding which x
variables may be candidates for consideration in your model is a deal-breaking
step, because you can’t go back and collect more data after the analysis
is over.
Always check to be sure that your response variable, y, and at least one of
the x variables are quantitative. For example, if y isn’t quantitative but at
least one x is, a logistic regression model may be in order (see Chapter 8).
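The check above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names is_quantitative and suggest_model are hypothetical, and "quantitative" is approximated as "every value is a number," which would wrongly accept categories that happen to be coded as numbers (such as 0 and 1).

```python
def is_quantitative(values):
    """True if every value is a number (bool is excluded on purpose)."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )

def suggest_model(y, x_columns):
    """Apply the rule of thumb from the text to y and a list of x columns."""
    any_x_quantitative = any(is_quantitative(col) for col in x_columns)
    if is_quantitative(y) and any_x_quantitative:
        return "multiple regression"
    if any_x_quantitative:
        # y isn't quantitative but at least one x is.
        return "consider logistic regression"
    # The passage doesn't address the remaining cases.
    return "not covered by this check"

print(suggest_model([1.5, 2.0, 3.1], [[1, 2, 3], ["a", "b", "a"]]))
print(suggest_model(["yes", "no", "yes"], [[1, 2, 3]]))
```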