Page 378 - Six Sigma Demystified



                          Minitab provides each of these stepwise regression techniques in the
                        Stat\Regression\Stepwise menu. To further test the fitted model, we should
                        conduct residuals analysis.



                        Residuals Analysis

                        Residuals analysis refers to a collection of techniques used to check the regres-
                        sion model for unnatural patterns, which may be indicative of an error in the
                        model. A residual is calculated as the difference between the observed value of
                        a response (the dependent variable) and the prediction for that response based
                        on the regression model and the actual values of the independent variables.


                                                       $e_i = y_i - \hat{y}_i$
                          Minitab and Excel will calculate a standardized residual. The effect of stan-
                        dardizing the residuals is to scale the error to achieve a variance of 1, which
                        helps to make outliers more prominent.
                          Standardized residual:

                                                        $e_i / \sqrt{s^2}$
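
The residual and standardized residual defined above can be sketched in a few lines of Python. This is a minimal illustration, not the Minitab or Excel implementation. It assumes a simple linear model with one independent variable, and it assumes that s² is the residual mean square, SSE/(n − p), where p is the number of fitted coefficients. The function name and the data are made up for the example.

```python
import numpy as np

def standardized_residuals(x, y):
    """Fit y = b0 + b1*x by least squares.

    Returns (residuals, standardized residuals), where the standardized
    residual is e_i / sqrt(s^2).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Design matrix with an intercept column
    X = np.column_stack([np.ones_like(x), x])

    # Least-squares estimates of the beta coefficients
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Residual: observed response minus predicted response
    y_hat = X @ beta
    e = y - y_hat

    # Assumption: s^2 is the residual mean square, SSE / (n - p)
    n, p = X.shape
    s2 = np.sum(e ** 2) / (n - p)

    return e, e / np.sqrt(s2)

# Illustrative data only; the last point is deliberately off the trend
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 19.0]

e, z = standardized_residuals(x, y)
for i, (ei, zi) in enumerate(zip(e, z), start=1):
    print(f"obs {i}: residual = {ei:7.3f}, standardized = {zi:7.3f}")
```

Because the standardized values are on a common scale, the observation with the largest absolute value stands out directly, which is the point of the scaling. Statistical packages may additionally adjust each residual for the leverage of its observation, so their output can differ from this simple form.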

                          There are a number of tools (discussed elsewhere in this part) that are useful
                        for finding patterns in residuals.


                          •  Normality test for residuals. If the error between the model and the data is
                             truly random, then the residuals should be normally distributed with a
                             mean of zero. A normal probability plot or goodness-of-fit test will allow
                             us to see departures from normality, indicating a poor fit of the model to
                             particular data. The normal probability plot for the residuals in the pre-
                             ceding “Regression Analysis” topic example provides a K-S test value of
                             0.991, indicating that the standardized residuals approximately fit a nor-
                             mal distribution, supporting the premise that the regression model fits the
                             data. The residual associated with observation 12 (marked in red in the
                             software) was significantly beyond the variation expected of a normal
                             distribution with mean of 0 and standard deviation of 1. Data far removed
                             from the “middle” of the data may have a large influence on the model
                             parameters (the β coefficients, the adjusted correlation coefficient, and
                             the mean square error terms). When standardized residuals are greater