Page 367 - Six Sigma Demystified



Simple linear regression analysis is often applied as part of the analysis done
with a scatter diagram (described below). Multiple regression is used when
there is more than one factor that influences the response. For example, cycle
time for a sales process may be affected by the number of items purchased and
the time of day. In this case, there are two independent variables: (1) number
of items purchased and (2) time of day. We can also estimate the interaction
between these factors. For example, perhaps the effect of time of day varies
depending on the number of items purchased. It may be that when only a few
items are purchased, the time of day makes a big difference in cycle time
variation, yet when many items are purchased, time of day makes little
difference to cycle time variation.
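
To make the idea of an interaction concrete, the following is a minimal sketch of a two-factor model with an interaction term. The coefficient values are invented purely for illustration, and coding time of day as a 0/1 indicator (off-peak or peak) is an assumption made here, not something taken from real sales data.

```python
def predicted_cycle_time(items, peak, b0=2.0, b1=0.5, b2=3.0, b12=-0.15):
    """Predicted cycle time from a two-factor model with interaction.

    items : number of items purchased
    peak  : 1 for a peak time of day, 0 for off-peak (assumed coding)
    b0, b1, b2, b12 : hypothetical coefficients, chosen only for illustration
    """
    return b0 + b1 * items + b2 * peak + b12 * items * peak


def time_of_day_effect(items):
    """Change in predicted cycle time when moving from off-peak to peak."""
    return predicted_cycle_time(items, 1) - predicted_cycle_time(items, 0)


few_items_effect = time_of_day_effect(2)    # small purchase
many_items_effect = time_of_day_effect(20)  # large purchase
```

With these made-up coefficients, the time-of-day effect equals b2 + b12 × items, so it is large for a small purchase and shrinks toward zero for a large one. Without the interaction term (b12 = 0), the time-of-day effect would be the same for every purchase size.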

Methodology

Simple Linear Regression
The regression model used for simple linear regression is that of a straight line.
You might recall this equation as y = m × x + b, where y is the dependent
variable, x is the independent variable, m is the slope, and b is the value of
y when x equals zero (b is sometimes called the intercept).

$$Y = \beta_0 + \beta_1 X + \text{error}$$

Another way to write this is using the Greek letter beta, as shown above.
$\beta_0$ (“beta naught”) is used to estimate the intercept, and $\beta_1$
(“beta one”) is used to indicate the slope of the regression line. We show the
equation using the Greek letters because most statistical textbooks use this
notation and it may be expanded easily to the multiple regression case
discussed below.
To define the equation, we need to estimate the two parameters: slope and
intercept. The statistical technique used most often is known as the method of
least squares, which finds values for $\beta_0$ and $\beta_1$ such that the sum
of the squared vertical distances between the fitted line and the experimental
data values is a minimum.
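
The least-squares estimates for a straight line have a well-known closed form: the slope is the sum of cross-deviations of x and y divided by the sum of squared deviations of x, and the intercept follows from the means. The sketch below applies those formulas. The function names and the five data points are hypothetical, chosen only to illustrate the calculation.

```python
def fit_line(xs, ys):
    """Return least-squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


def sum_squared_error(xs, ys, b0, b1):
    """Sum of squared vertical distances between the data and the line."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = fit_line(xs, ys)
best = sum_squared_error(xs, ys, b0, b1)
```

Any other line through the same data, for example one with a slightly different slope, gives a larger value of `sum_squared_error` than `best`. That is what "least squares" means.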
The error term is an acknowledgment that even if we could sample all possible
values, there most likely would be some unpredictability in the outcome. This
unpredictability could result from many possibilities, including measurement
error in either the dependent or independent variable, the effects of other
unknown variables, or nonlinear effects.
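
A small simulation can illustrate the role of the error term. The sketch below generates data from a line with known parameters plus random noise, then fits a line by least squares. The true parameter values, the noise level, and the use of normally distributed noise are all assumptions made for this illustration.

```python
import random


def fit_line(xs, ys):
    """Closed-form least-squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


random.seed(0)  # fixed seed so the run is repeatable

TRUE_B0, TRUE_B1 = 3.0, 2.0          # assumed "true" intercept and slope
xs = [i * 0.05 for i in range(200)]  # 200 x values from 0 to 9.95
# Each response is the true line plus a random error term (std. dev. 1).
ys = [TRUE_B0 + TRUE_B1 * x + random.gauss(0, 1) for x in xs]

b0, b1 = fit_line(xs, ys)
# Residuals: the part of each observation the fitted line does not explain.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
```

The estimates `b0` and `b1` land close to, but not exactly on, the values used to generate the data, and the residuals do not vanish. They are the sample counterpart of the error term in the model.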
