Page 303 - Statistics for Dummies

Chapter 18: Looking for Links: Correlation and Regression

                                                    Never do a regression analysis unless you have already found at least a mod-
                                                    erately strong correlation between the two variables. (My rule of thumb is it
                                                    should be at or beyond either positive or negative 0.50, but other statisticians
                                                    may have different criteria.) I’ve seen cases where researchers go ahead and
                                                    make predictions when a correlation is as low as 0.20! By anyone’s standards,
                                                    that doesn’t make sense. If the data don’t resemble a line to begin with, you
                                                    shouldn’t try to use a line to fit the data and make predictions (but people
                                                    still try).
                                                    Figuring out which variable is X and which is Y
                                                    Before moving forward to find the equation for your regression line, you have
                                                    to identify which of your two variables is X and which is Y. When doing cor-
                                                    relations (as I explain earlier in this chapter), the choice of which variable is X
                                                    and which is Y doesn’t matter, as long as you’re consistent for all the data.
                                                    But when fitting lines and making predictions, the choice of X and Y does
                                                    make a difference.
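
To see why the choice matters for lines but not for correlations, here is a minimal Python sketch. The chirp and temperature numbers are made up for illustration (they are not the data from the earlier example), and the helper names `correlation` and `slope` are my own. Swapping the two variables leaves r unchanged, but it gives a different slope.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def correlation(a, b):
    """Pearson correlation r between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def slope(x, y):
    """Least-squares slope of the line that predicts y from x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical data, invented for this sketch.
chirps = [18, 20, 21, 23, 27, 30, 34, 39]
temps = [57, 60, 64, 65, 68, 71, 74, 77]

# Same r either way round.
print(correlation(chirps, temps), correlation(temps, chirps))

# Different slopes: degrees per chirp versus chirps per degree.
print(slope(chirps, temps), slope(temps, chirps))
```

The first slope answers "how much does temperature change per extra chirp?" and the second answers the reverse question, so the two lines are not interchangeable.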
                                                   So how do you determine which variable is which? In general, Y is the vari-
                                                    able that you want to predict, and X is the variable you are using to make that
                                                    prediction. In the earlier cricket chirps example, you are using the number of
                                                    chirps to predict the temperature. So in this case the variable Y is the tem-
                                                    perature, and the variable X is the number of chirps. Hence Y can be predicted
                                                    by X using the equation of a line if a strong enough linear relationship exists.
                                                    Statisticians call the X-variable (cricket chirps in my earlier example) the
                                                    explanatory variable, because if X changes, the slope tells you (or explains)
                                                    how much Y is expected to change in response. Therefore, the Y variable is
                                                    called the response variable. Other names for X and Y include the independent
                                                    and dependent variables, respectively.
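
The way the slope "explains" the response can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are invented, and `fit_line` and `predict` are hypothetical helper names, not anything defined in this chapter. The sketch fits a least-squares line with X as the explanatory variable and shows that raising X by one unit raises the predicted Y by exactly the slope.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

# Hypothetical data, invented for this sketch.
chirps = [18, 20, 21, 23, 27, 30, 34, 39]   # X: explanatory variable
temps = [57, 60, 64, 65, 68, 71, 74, 77]    # Y: response variable

m, b = fit_line(chirps, temps)

def predict(x_value):
    """Predicted Y for a given X, using the fitted line."""
    return m * x_value + b

# One more chirp changes the predicted temperature by the slope.
change = predict(26) - predict(25)
print(m, change)
```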
                                                    Checking the conditions
                                                   In the case of two numerical variables, you can come up with a line that
                                                    enables you to predict Y from X, if (and only if) the following two conditions
                                                    from the previous sections are met:
                                                     ✓ The scatterplot must form a linear pattern.
                                                     ✓ The correlation, r, is moderate to strong (typically beyond 0.50 or –0.50).
                                                    Some researchers actually don’t check these conditions before making pre-
                                                    dictions. Their claims are not valid unless the two conditions are met.
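
The second condition is easy to check in code before any line is fitted. The sketch below is one possible way to do it, assuming the 0.50 rule of thumb given above; the function names and both data sets are hypothetical. It does not replace the first condition: you still have to look at the scatterplot yourself, because a correlation alone can't tell you whether the pattern is actually linear.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation r (assumes neither list is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

def correlation_is_strong_enough(x, y, cutoff=0.50):
    """True if r is at or beyond +cutoff or -cutoff."""
    return abs(pearson_r(x, y)) >= cutoff

# Hypothetical data, invented for this sketch.
chirps = [18, 20, 21, 23, 27, 30, 34, 39]
temps = [57, 60, 64, 65, 68, 71, 74, 77]
scattered_x = [1, 2, 3, 4, 5, 6]
scattered_y = [3, 1, 4, 1, 5, 2]

print(correlation_is_strong_enough(chirps, temps))
print(correlation_is_strong_enough(scattered_x, scattered_y))
```

The cutoff is a parameter so that a reader who follows a different criterion can pass their own value.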







