Page 106 - Statistics II for Dummies
90 Part II: Using Different Types of Regression to Make Predictions
Minitab can compute a correlation matrix for any set of variables in the
model, including the y variable and all the x variables. To calculate a
correlation matrix for a group of variables in Minitab, first enter your data in
columns (one for each variable). Then go to Stat>Basic Statistics>Descriptive
Statistics>Correlation. Highlight the variables from the left-hand side for which
you want correlations, and click Select.
To read a value from the correlation matrix in the computer output, find the
cell where the row for one variable meets the column for the other; the top
number in that cell is the correlation between those two variables.
For example, the correlation between TV ads and TV sales is 0.791, which is
the value where the TV row meets the Sales column in the correlation matrix
in Figure 5-2.
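If you don't have Minitab handy, the same kind of matrix can be sketched in a few lines of Python. This is only an illustration: the function names pearson_r and correlation_matrix are made up for this example, and the numbers are invented, so they are not the data behind Figure 5-2. Unlike the Minitab output, this sketch shows only the correlation in each cell.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r for two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Correlation of every variable with every other, as a dict of dicts."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}

# Made-up numbers for illustration only; NOT the data behind Figure 5-2.
data = {
    "Sales": [10, 12, 15, 19, 24, 30],
    "TV": [1, 2, 2, 4, 5, 6],
    "Newspaper": [3, 1, 4, 2, 6, 5],
}
matrix = correlation_matrix(data)

# Print the matrix; read a correlation where a row meets a column.
names = list(data)
print(" " * 10 + "".join(f"{name:>10}" for name in names))
for row in names:
    print(f"{row:<10}" + "".join(f"{matrix[row][col]:>10.3f}" for col in names))
```

Looking up matrix["TV"]["Sales"] plays the same role as intersecting the TV row with the Sales column in the printed output.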
Testing correlations for significance
By the rule-of-thumb approach from Stats I (also reviewed in Chapter 4),
a correlation that's close to 1 or –1 (starting around ± 0.75) is strong; a
correlation close to 0 is very weak or nonexistent; and around ± 0.6 to 0.7, the
relationship is moderately strong. The correlation between TV ads
and TV sales of 0.791 indicates a fairly strong positive linear relationship
between these two variables, based on the rule of thumb. The correlation
between newspaper ads and TV sales seen in Figure 5-2 is 0.594, which is
moderate by my rule of thumb.
Many times in statistics a rule-of-thumb approach to interpreting a correlation
coefficient is sufficient. However, you’re in the big leagues now, so you need
a more precise tool for determining whether or not a correlation coefficient is
large enough to be statistically significant. That's the real test of any statistic:
not whether the relationship is fairly strong or moderately strong in the sample,
but whether the relationship can be generalized to the population.
Now, that phrase statistically significant should ring a bell. It's your old friend
the hypothesis test calling to you (see Chapter 3 for a brush-up on
hypothesis testing). Just like a hypothesis test for the mean of a population or the
difference in the means of two populations, you also have a test for the
correlation between two variables within a population.
The hypotheses for testing a correlation are Ho: ρ = 0 (no linear relationship)
versus Ha: ρ ≠ 0 (a linear relationship exists). The letter ρ is the Greek version
of r and represents the true correlation of x and y in the entire population; r is
the correlation coefficient of the sample.
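To see what the software does behind the scenes, here is a minimal Python sketch of the standard t statistic for testing Ho: ρ = 0, which is t = r√(n – 2)/√(1 – r²) with n – 2 degrees of freedom. The passage doesn't spell out this formula, so treat it as background rather than the book's own method. The function name correlation_t_statistic is made up for this example, and the sample size n = 22 is an assumption for illustration, because the passage doesn't state it.

```python
from math import sqrt

def correlation_t_statistic(r, n):
    """t statistic for testing Ho: rho = 0, given sample correlation r
    and n pairs of observations. Under Ho it follows a t-distribution
    with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 pairs of observations")
    if not -1 < r < 1:
        raise ValueError("r must be strictly between -1 and 1")
    return r * sqrt((n - 2) / (1 - r ** 2))

r = 0.791  # sample correlation between TV ads and TV sales (from the text)
n = 22     # ASSUMED sample size for illustration; not stated in the passage
t = correlation_t_statistic(r, n)
print(f"t = {t:.3f} with {n - 2} degrees of freedom")
```

A t statistic far from zero (compared with a t-table value for n – 2 degrees of freedom, or converted to a p-value by your software) is evidence against Ho.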
✓ If you can’t reject Ho based on your data, you can’t conclude that the
correlation between x and y differs from zero. You don’t have evidence
that the two variables are linearly related, so x shouldn’t be in the
multiple regression model.