Page 374 - Six Sigma Demystified



                          The ANOVA table uses the F statistic to compare the variability accounted
                        for by the regression model with the remaining variation owing to error. The
                        null hypothesis is that the coefficients of the regression model are zero; the
                        alternative hypothesis is that at least one of the coefficients is nonzero and thus
                        provides some ability to estimate the response.
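As a minimal sketch of the comparison the ANOVA table makes, the F statistic can be written as the regression mean square divided by the error mean square. The function name and the sums of squares below are hypothetical and are not taken from the example in the text.

```python
def f_statistic(ss_regression, df_regression, ss_error, df_error):
    """Return the ratio of the regression mean square to the error mean square."""
    ms_regression = ss_regression / df_regression  # variability explained by the model
    ms_error = ss_error / df_error                 # remaining variability owing to error
    return ms_regression / ms_error


# Hypothetical ANOVA values, chosen only to illustrate the arithmetic.
f_value = f_statistic(ss_regression=120.0, df_regression=3,
                      ss_error=80.0, df_error=8)
print(f_value)
```

A large ratio suggests the model accounts for more variation than error alone would. If SciPy is available, `scipy.stats.f.sf(f_value, 3, 8)` would give the corresponding p value for these degrees of freedom.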
                          Although we could use the F-statistic tables in Appendices 4 and 5 to
                        decide whether to reject the null hypothesis, most statistical software will
                        provide a p value for the F statistic to indicate the relevance of the model. In
                        most cases we will reject the null hypothesis and assert that the calculated
                        linear regression model is significant when the p value is less than 0.10. (While
                        a p value of 0.05 or less is preferred to indicate significance, a value of 0.10
                        or less is accepted in preliminary analyses. The assumption is that the parameter
                        will be retained so that additional analysis can better determine its
                        significance.)
                          Bear in mind that statistical significance may or may not indicate physical
                        significance. If we find that a given factor is statistically significant for a
                        response, this does not necessarily mean that the factor is in fact significant in
                        predicting the response in the real world. Factors may happen to vary
                        coincidentally with other factors, some of which may be significant. For example,
                        if we estimate that shoe size is statistically significant in understanding the
                        variation in height, it does not mean that shoe size is a good predictor of
                        height, nor should it imply the causal relation that increasing shoe size
                        increases height.
                          In this example, the calculated p value for the F test is 0.032, so the null
                        hypothesis that all the coefficients are zero is rejected.
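The decision rule described earlier (0.05 or less preferred, 0.10 or less accepted in preliminary analyses) can be sketched as a small function. The function name, parameter names, and returned labels are assumptions made for illustration.

```python
def assess_model(p_value, preferred_alpha=0.05, preliminary_alpha=0.10):
    """Classify an F-test p value using the two thresholds from the text."""
    if p_value <= preferred_alpha:
        return "significant"
    if p_value <= preliminary_alpha:
        # Kept so that additional analysis can better determine significance.
        return "retain for further analysis"
    return "not significant"


print(assess_model(0.032))  # the p value reported for this example
```

With the p value of 0.032 from this example, the sketch returns "significant", which matches the rejection of the null hypothesis.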
                          The regression model shown in the Minitab and Excel analyses above is

                             Response = 45.6 + 0.288(factor A) – 0.380(factor B) + 0.0111(factor C)

                          The regression model represents our best estimate of future values of y based
                        on given values of each significant factor. For example, when there are 10 units
                        of factor A, 1 unit of factor B, and 100 units of factor C, the best estimate for
                        the response y is

                                 Response = 45.6 + 0.288(10) – 0.380(1) + 0.0111(100) = 49.21

                          Similarly, we could calculate values of the response y for any value of input
                        factors x. Recall that extrapolation beyond our data region should be done with
                        caution.
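The prediction above can be reproduced with a short sketch that uses the coefficients of the fitted model. The function and argument names are illustrative assumptions.

```python
def predict_response(factor_a, factor_b, factor_c):
    """Evaluate the fitted regression model at the given factor levels."""
    return 45.6 + 0.288 * factor_a - 0.380 * factor_b + 0.0111 * factor_c


# 10 units of factor A, 1 unit of factor B, 100 units of factor C.
estimate = predict_response(10, 1, 100)
print(round(estimate, 2))  # 49.21
```

The same function can be evaluated at other factor levels, keeping in mind that levels outside the region covered by the data are extrapolations.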
                          Each coefficient β_i indicates the predicted change in y for a unit change in
                        x_i when all other terms are held constant. For example, β_1 = 0.288 implies that
                        the response increases by 0.288 units for each additional unit of factor A.
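To illustrate this interpretation, the sketch below raises one factor at a time by a single unit while holding the others constant and reports the change in the predicted response. The dictionary layout, the names, and the baseline levels are assumptions chosen for illustration.

```python
INTERCEPT = 45.6
COEFFICIENTS = {"factor A": 0.288, "factor B": -0.380, "factor C": 0.0111}


def predict(levels):
    """Evaluate the fitted model for a mapping of factor name to level."""
    return INTERCEPT + sum(COEFFICIENTS[name] * value
                           for name, value in levels.items())


baseline = {"factor A": 10, "factor B": 1, "factor C": 100}

for name in COEFFICIENTS:
    bumped = dict(baseline)
    bumped[name] += 1  # one additional unit of this factor only
    change = predict(bumped) - predict(baseline)
    print(name, round(change, 4))
```

Each printed change equals the coefficient of the factor that was raised, because the model is linear with no interaction terms.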