Page 396 - Six Sigma Demystified

                        Interpretation

                        Correlation means that as one variable changes, the other tends to change as
                        well. Although this may indicate a cause-and-effect relationship, that is not
                        always the case: a third characteristic (or several) may actually cause the
                        observed changes in both characteristics.
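To make the third-characteristic point concrete, here is a minimal sketch with made-up numbers. It assumes a hypothetical hidden characteristic `z` that shifts both `x` and `y`; the data, the names, and the `pearson_r` helper are illustrative and do not come from the text. Taken as a whole, `x` and `y` look strongly correlated, yet within each level of `z` they are not.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: z is a hidden third characteristic (0 or 10)
# that is added to both x and y.
z = [0, 0, 0, 0, 10, 10, 10, 10]
x = [1, 2, 3, 4, 11, 12, 13, 14]
y = [2, 4, 1, 3, 12, 14, 11, 13]

# Correlation over all points, ignoring z.
r_overall = pearson_r(x, y)

# Correlation within each level of z (z held constant).
r_by_level = {}
for level in sorted(set(z)):
    xs = [a for a, c in zip(x, z) if c == level]
    ys = [b for b, c in zip(y, z) if c == level]
    r_by_level[level] = pearson_r(xs, ys)

print("overall R:", round(r_overall, 3))
print("R within each z level:", {k: round(v, 3) for k, v in r_by_level.items()})
```

In this constructed example the overall correlation comes entirely from `z`: once `z` is held constant, `x` tells us nothing about `y`.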
                          Sometimes, though, if we know that there is good correlation between two
                        characteristics, we can use one to predict the other, particularly if one charac-
                        teristic is easy to measure and the other isn’t. For instance, if we prove that
                        weight gain in the first trimester of pregnancy correlates well with fetus devel-
                        opment, we can use weight gain as a predictor. The alternative would be expen-
                        sive tests to monitor the actual development of the fetus.
                          If the two characteristics are somehow related, the points plotted in a
                        scatter diagram will cluster in a certain direction and with a certain
                        tightness. The more the cluster approaches a line in appearance, the more the
                        two characteristics are likely to be linearly correlated.
                          The relative correlation of one characteristic with another can be seen both
                        from how closely the points cluster about a line and from the correlation
                        coefficient R. R values near +1 or -1 imply very high correlation between the
                        dependent and independent variables, meaning that a change in one
                        characteristic will be accompanied by a predictable change in the other.
                        R values near 0 indicate weak linear correlation, meaning that the variation
                        in the dependent variable y is not well explained by the changes in the
                        independent variable. This lack of correlation implies that other variables
                        (including measurement error) may be responsible for the variation in y.
                          Positive correlation means that as the independent variable increases, so does
                        the dependent variable, and this is shown on the scatter diagram as a line with
                        a positive slope. Negative correlation implies that as the independent variable
                        increases, the dependent variable decreases, and a negative slope is seen on the
                        scatter diagram.
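The strength and sign of the correlation described above can be checked numerically. The following is a minimal sketch that computes the Pearson correlation coefficient from its standard definition; the data sets are invented for illustration, and the function name `pearson_r` is an assumption rather than anything defined in the text.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient R for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R is undefined when a variable does not vary")
    return sxy / math.sqrt(sxx * syy)

# Illustrative (made-up) data.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # rises exactly in step with x
y_down = [10, 8, 6, 4, 2]  # falls exactly in step with x
y_weak = [3, 1, 4, 1, 5]   # no strong linear pattern

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
r_weak = pearson_r(x, y_weak)

print("positive correlation:", round(r_up, 3))
print("negative correlation:", round(r_down, 3))
print("weak correlation:", round(r_weak, 3))
```

The sign of R matches the slope seen on the scatter diagram, while its magnitude reflects how tightly the points cluster about a line.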
                          Regression analysis (discussed earlier) provides a prediction model for the
                        relationship between the two variables.
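As a rough illustration of such a prediction model, the sketch below fits a straight line by ordinary least squares and uses it to predict y at a new x. It assumes a simple linear model with one independent variable; the data and the names `fit_line` and `predict` are hypothetical and are not taken from the earlier regression discussion.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x_new, slope, intercept):
    """Predicted y for a given x under the fitted line."""
    return intercept + slope * x_new

# Illustrative (made-up) measurements.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(x, y)
y_hat = predict(3.5, slope, intercept)  # 3.5 lies inside the observed x range

print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("predicted y at x = 3.5:", round(y_hat, 3))
```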
                          Be careful when extrapolating beyond the data region: you have no
                        experience there on which to draw, and the relationship between the two
                        variables may change significantly. For example, the size of a balloon will
                        increase as we pump more air into it, but only to a point! Once we pass that
                        point, the outcome will change dramatically.
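One way to enforce this caution in practice is to have the prediction step refuse inputs outside the observed data region. The sketch below is one possible guard, not a prescribed method; the function name `predict_within_range` and the slope and intercept values are hypothetical.

```python
def predict_within_range(x_new, x_observed, slope, intercept):
    """Predict y from a fitted line, but only inside the observed x range."""
    lo, hi = min(x_observed), max(x_observed)
    if not lo <= x_new <= hi:
        raise ValueError(
            f"x = {x_new} is outside the data region [{lo}, {hi}]; "
            "the fitted relationship may not hold there"
        )
    return intercept + slope * x_new

# Hypothetical fitted line and the x values it was fitted on.
x_observed = [1, 2, 3, 4, 5]
slope, intercept = 1.97, 0.09

inside = predict_within_range(3.5, x_observed, slope, intercept)
print("prediction inside the data region:", round(inside, 3))

try:
    predict_within_range(9.0, x_observed, slope, intercept)
except ValueError as err:
    print("refused:", err)
```

Raising an error is a deliberately strict choice; a warning would also work where occasional, carefully reviewed extrapolation is acceptable.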
                          When we suspect that there are hidden variables, we can stratify our data to