Page 367 - Six Sigma Demystified



                             Simple linear regression analysis is often applied as part of the analysis done
                           with a scatter diagram (described below). Multiple regression is used when
                           there is more than one factor that influences the response. For example, cycle
                           time for a sales process may be affected by the number of items purchased and
                           the time of day. In this case, there are two independent variables: (1) number
                           of items purchased and (2) time of day. We also can estimate the interaction
                           between these factors. For example, perhaps the effect of time of day varies
                           depending on the number of items purchased. It may be that when only a few
                           items are purchased, the time of day makes a big difference in cycle time
                           variation, yet when many items are purchased, time of day makes little
                           difference to cycle time variation.
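
                           As a minimal sketch of what an interaction means, the Python snippet
                           below evaluates a two-factor model with an interaction term. Everything
                           in it is an assumption made for illustration: the coefficient values are
                           invented, and time of day is coded as a hypothetical 0/1 indicator
                           (0 = morning, 1 = evening). It is not taken from real sales data.

```python
def predicted_cycle_time(items, evening, b0=5.0, b1=1.0, b2=4.0, b12=-0.4):
    """Two-factor model with an interaction term (all coefficients are made up).

    items   : number of items purchased
    evening : hypothetical indicator, 0 = morning, 1 = evening
    b12     : interaction coefficient; lets the time-of-day effect
              change with the number of items
    """
    return b0 + b1 * items + b2 * evening + b12 * items * evening


def time_of_day_effect(items):
    """Difference in predicted cycle time between evening and morning."""
    return predicted_cycle_time(items, 1) - predicted_cycle_time(items, 0)


few_items_effect = time_of_day_effect(1)    # large time-of-day effect
many_items_effect = time_of_day_effect(10)  # effect shrinks to about zero
print(few_items_effect, many_items_effect)
```

                           With these invented coefficients, the time-of-day effect is sizable for a
                           one-item purchase and vanishes for a ten-item purchase. Without the
                           interaction term (b12 = 0), the effect would be the same at every
                           purchase size.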




                           Methodology


                           Simple Linear Regression
                           The regression model used for simple linear regression is that of a straight line.
                           You might recall this equation as y = m × x + b, where y is the dependent
                           variable, x is the independent variable, m is the slope, and b is the value of y
                           when x equals zero (b is sometimes called the intercept).

                                                      $$Y = \beta_0 + \beta_1 X + \text{error}$$

                             Another way to write this is using the Greek letter beta, as shown above.
                           $\beta_0$ (“beta naught”) is used to estimate the intercept, and $\beta_1$
                           (“beta one”) is used to indicate the slope of the regression line. We show the
                           equation using the Greek letters because most statistical textbooks use this
                           notation and it may be expanded easily to the multiple regression case
                           discussed below.
                             To define the equation, we need to estimate the two parameters: slope
                           and intercept. The statistical technique used most often is known as the
                           method of least squares, which finds the values of $\beta_0$ and $\beta_1$ that
                           minimize the sum of the squared vertical distances between the fitted line
                           and the experimental data values.
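
                           The least-squares estimates for a straight line have a well-known closed
                           form, sketched below in plain Python. The function name fit_line and the
                           five data points are assumptions made up for illustration; in practice a
                           statistical package would normally do this calculation.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares straight line.

    The slope is the sum of cross-deviations divided by the sum of
    squared x-deviations; the intercept makes the line pass through
    the point (mean of x, mean of y).
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Made-up example data: items purchased vs. cycle time (minutes).
items = [1, 2, 3, 4, 5]
cycle_time = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(items, cycle_time)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

                           For this invented data set the fitted intercept is about 0.05 and the
                           fitted slope is about 1.99, so each additional item adds roughly two
                           minutes to the predicted cycle time.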
                             The error term is an acknowledgment that even if we could sample all
                           possible values, there most likely would be some unpredictability in the outcome.
                           This unpredictability could result from many possibilities, including
                           measurement error in either the dependent or independent variable, the effects
                           of other unknown variables, or nonlinear effects.
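
                           In a fitted model, the error term shows up as residuals: the differences
                           between each observed value and the value the line predicts. The sketch
                           below computes them for a hypothetical line (intercept 0.05, slope 1.99)
                           and five made-up data points; none of these numbers come from a real
                           study.

```python
# Hypothetical fitted line and made-up observations (illustration only).
intercept, slope = 0.05, 1.99
items = [1, 2, 3, 4, 5]
cycle_time = [2.1, 3.9, 6.2, 7.8, 10.1]

# Residual = observed value minus the value predicted by the line.
residuals = [y - (intercept + slope * x) for x, y in zip(items, cycle_time)]

for x, r in zip(items, residuals):
    print(f"items = {x}: residual = {r:+.2f}")
```

                           The residuals here are small and scattered on both sides of zero. A
                           clear pattern in the residuals, such as a curve, would suggest that a
                           straight line is missing something, for example a nonlinear effect.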