Page 375 - Six Sigma Demystified

Part 3  Six Sigma Tools        355


                             Generally, the magnitude of the β coefficients does not provide an indication
                           of the significance or impact of the factor. In the example equation below, we
                           cannot say that factor A is more critical than factor C simply because β₁ is
                           larger than β₃ because the scaling (or unit of measure) of each factor may be
                           different. Some software (such as Minitab) will provide the regression function
                           in coded form, in which case the coefficients are applied to coded values of the
                           factors (such as –1 and +1), allowing direct comparison to estimate the effects
                           of the factors.
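
The effect of scaling on the coefficients can be sketched in a few lines. This is a minimal illustration, not output from Minitab or any other package: the model, the factor names (temperature, pressure), the coefficient values, and the factor ranges are all invented for the example. It converts each uncoded coefficient to its coded equivalent by multiplying by the half-range of the factor, which is what the –1/+1 coding implies for a linear term.

```python
# Hypothetical fitted model in uncoded (natural) units; the numbers are
# invented for illustration only:
#   y = 10 + 0.2 * temperature + 4.0 * pressure
# Assumed low/high settings of each factor in the experiment:
FACTORS = {
    "temperature": {"beta": 0.2, "low": 150.0, "high": 200.0},
    "pressure": {"beta": 4.0, "low": 1.0, "high": 2.0},
}


def code(value, low, high):
    """Map a natural factor setting onto the coded -1..+1 scale."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (value - center) / half_range


def coded_coefficient(beta, low, high):
    """Change in y per one coded unit (from the center to the high level)."""
    return beta * (high - low) / 2


for name, f in FACTORS.items():
    b_coded = coded_coefficient(f["beta"], f["low"], f["high"])
    print(f"{name}: uncoded beta = {f['beta']}, coded beta = {b_coded}")
```

In this made-up case the uncoded coefficient for pressure is 20 times that for temperature, yet over the ranges studied temperature has the larger coded coefficient, so it moves the response more.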
                             Once we have constructed a model, there are a number of ways to check the
                           model, especially through the use of residuals analysis (discussed next) to look
                           for patterns.
                             A confidence interval for the regression line may be constructed to indicate
                           the quality of the fitted regression function. The confidence lines diverge at the
                           ends and converge in the middle, which may be explained in one of two ways:

                             1. The regression function for the fitted line requires estimation of two
                                parameters: slope and y intercept. The error in estimating intercept provides
                                a gap in the vertical direction. The error in estimating slope can be
                                visualized by imagining the fitted line rotating about its middle. This results in
                                the hourglass-shaped region shown by the confidence intervals.
                             2. The center of the data is located near the middle of the fitted line. The
                                ability to predict the regression function should be better at the center of
                                the data; hence the confidence limits are narrower at the middle. The
                                ability to estimate at the extreme conditions is much less, resulting in a wider
                                band at each end.
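
For simple linear regression, the two explanations above correspond to the two terms in the standard formula for the confidence interval on the line. The sketch below assumes the usual model with independent, constant-variance errors. The symbols are conventional notation rather than the book's: n is the number of observations, x̄ and ȳ are the sample means, b₁ is the fitted slope, s is the residual standard deviation, and t is the Student t critical value with n − 2 degrees of freedom.

```latex
\hat{y}(x_0) = \bar{y} + b_1\,(x_0 - \bar{x}),
\qquad
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2

\hat{y}(x_0) \;\pm\; t_{\alpha/2,\,n-2}\; s\,
\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
```

The 1/n term reflects uncertainty in the vertical level of the line and is the same at every x₀. The second term reflects uncertainty in the slope; it is zero at x₀ = x̄ and grows with the squared distance from the center of the data, which produces the hourglass shape.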

                             Don’t confuse the confidence interval on the line with a prediction interval
                           for new data. If we assume that the new data are independent of the data used
                           to calculate the fitted regression line, then a prediction interval for future
                           observations depends on the error that is built into the regression model plus the
                           error associated with future data. While our best estimate for the y value based
                           on a given x value is found by solving the regression equation, we recognize that
                           there can be variation in the actual y values that will be observed. Thus the
                           shape of the prediction interval will be similar to that seen in the confidence
                           interval but wider.
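
The difference between the two intervals can be checked numerically. The following is a minimal sketch for simple linear regression using only the Python standard library. The data values and the function names (fit_line, standard_errors) are invented for illustration. It computes the standard error of the fitted line and the standard error for a new observation at several x values. Multiplying either by the appropriate t critical value (for example from `scipy.stats.t.ppf`, not used here) would give the interval half-widths.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x, with summary statistics."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # residual standard deviation
    return intercept, slope, s, x_bar, sxx, n


def standard_errors(x0, s, x_bar, sxx, n):
    """Return (SE of the fitted line at x0, SE for a new observation at x0)."""
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    se_line = s * math.sqrt(leverage)      # error built into the fitted model
    se_new = s * math.sqrt(1 + leverage)   # plus the scatter of future data
    return se_line, se_new


# Invented data for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

intercept, slope, s, x_bar, sxx, n = fit_line(xs, ys)
for x0 in (1, 4.5, 8):
    se_line, se_new = standard_errors(x0, s, x_bar, sxx, n)
    print(f"x0 = {x0}: SE(line) = {se_line:.4f}, SE(new obs) = {se_new:.4f}")
```

Both standard errors are smallest at the mean of the x values and grow toward the ends, and the value for a new observation is always the larger of the two because it adds the residual variance s² to the variance of the fitted line.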
                             Another useful statistic provided by the ANOVA table is the coefficient of
                           determination (R²), which is the square of the Pearson correlation coefficient
                           R. R² varies between 0 and 1 and indicates the fraction of the variation in the data
                           accounted for by the regression model. In multiple regression, the Pearson cor-