Page 396 - Six Sigma Demystified
Interpretation
Correlation means that as one variable changes, the other tends to change as
well. Although this may indicate a cause-and-effect relationship, that is not
always the case: a third characteristic (or several) may actually cause the
observed change in both characteristics.
Sometimes, though, if we know that there is good correlation between two
characteristics, we can use one to predict the other, particularly if one
characteristic is easy to measure and the other isn't. For instance, if we
establish that weight gain in the first trimester of pregnancy correlates well
with fetal development, we can use weight gain as a predictor. The alternative
would be expensive tests to monitor the actual development of the fetus.
If the two characteristics are somehow related, the points plotted in a scatter
diagram will cluster in a particular direction and with a particular tightness.
The more the cluster approaches a line in appearance, the more the two
characteristics are likely to be linearly correlated.
The relative correlation of one characteristic with another can be seen both
from how closely the points cluster about a line and from the correlation
coefficient R. Values of R near +1 or -1 imply very high correlation between
the dependent and independent variables, meaning that a change in one
characteristic will be accompanied by a predictable change in the other.
Values of R near 0 indicate weak correlation, meaning that the variation in the
dependent variable is not well explained by the changes in the independent
variable. This lack of correlation implies that other variables (including
measurement error) may be responsible for the variation in the dependent
variable y.
Positive correlation means that as the independent variable increases, so does
the dependent variable, and this is shown on the scatter diagram as a line with
a positive slope. Negative correlation implies that as the independent variable
increases, the dependent variable decreases, and a negative slope is seen on the
scatter diagram.
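The strength and direction described above can be checked numerically. The following is a minimal sketch that computes the Pearson correlation coefficient R from its definition. The three data sets are hypothetical and were chosen only to show a tight positive cluster, a tight negative cluster, and a loose scatter.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient R for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements of an independent variable x.
x = [1, 2, 3, 4, 5, 6]

y_tight = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]   # nearly on a rising line
y_down = [12.0, 10.1, 7.9, 6.2, 3.8, 2.1]   # nearly on a falling line
y_loose = [5, 1, 6, 2, 5, 3]                # no clear pattern

r_tight = pearson_r(x, y_tight)
r_down = pearson_r(x, y_down)
r_loose = pearson_r(x, y_loose)

print(f"tight, positive slope: R = {r_tight:.3f}")
print(f"tight, negative slope: R = {r_down:.3f}")
print(f"loose scatter:         R = {r_loose:.3f}")
```

The sign of R matches the slope seen on the scatter diagram, and its magnitude reflects how tightly the points cluster about a line.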
Regression analysis (discussed earlier) provides a prediction model for the
relationship between the two variables.
Be careful when extrapolating beyond the data region: you have no experience
there on which to draw, and the relationship between the two variables may
change significantly. For example, the size of a balloon will increase as we
pump more air into it, but only to a point! Once we pass that point, the
outcome will change dramatically.
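One way to guard against accidental extrapolation is to have the prediction model refuse any input outside the range of the data used to fit it. The sketch below fits a least-squares line to hypothetical balloon measurements (the pump counts and diameters are invented for illustration, and `predict` is a hypothetical helper) and raises an error when asked to predict outside the observed region.

```python
# Hypothetical observations: number of pumps of air and balloon diameter (cm).
pumps = [2, 4, 6, 8, 10]
diameter = [5.1, 9.8, 15.2, 19.9, 25.0]

# Ordinary least-squares fit of diameter = intercept + slope * pumps.
n = len(pumps)
mean_x = sum(pumps) / n
mean_y = sum(diameter) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pumps, diameter))
sxx = sum((x - mean_x) ** 2 for x in pumps)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

def predict(x):
    """Predict diameter, but only inside the region covered by the data."""
    lo, hi = min(pumps), max(pumps)
    if not lo <= x <= hi:
        raise ValueError(
            f"{x} pumps is outside the data region [{lo}, {hi}]; "
            "the fitted line is not supported by any observations there"
        )
    return intercept + slope * x

print(f"slope = {slope:.3f} cm per pump")
print(f"predicted diameter at 7 pumps: {predict(7):.2f} cm")

# Without the guard, the line would happily return a diameter for 50 pumps,
# even though a real balloon may have burst long before that point.
try:
    predict(50)
except ValueError as err:
    print("refused:", err)
```

Nothing in the fitted line itself warns that the relationship may break down, so the range check has to be an explicit design decision.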
When we suspect that there are hidden variables, we can stratify our data to