Page 83 - Statistics II for Dummies
Chapter 4: Getting in Line with Simple Linear Regression
Another situation arises when no data were collected near the value of x = 0;
in that case, interpreting the y-intercept is not appropriate. For example,
suppose you use a student’s score on midterm 1 to predict her score on midterm 2.
Unless the student didn’t take the exam at all (in which case it doesn’t count),
she’ll get at least some points, so no data fall near x = 0.
Many times, however, the y-intercept is of interest to you and has a value
that you can interpret, such as when you’re talking about predicting coffee
sales using temperature for football games. Some games get cold enough
to have zero and subzero temperatures (like Packers games for example —
Go Pack Go!).
Suppose I collect data on ten of my students who recorded their study time
(in minutes) for a 10-point quiz, along with their quiz scores. The data have a
strong linear relationship by all the methods used in this chapter (for example,
refer to the earlier section “Exploring Relationships with Scatterplots and
Correlations”). I went ahead and conducted a regression analysis, and the
results are shown in Figure 4-3.
Because there are students who (heaven forbid!) didn’t study at all for the
quiz, the y-intercept of 3.29 points (where study time x = 0) can be interpreted
safely. Its value is shown in the Coef column in the row marked
Constant (see the section “The y-intercept of the regression line” for more
information). The next step is to give a confidence interval for the y-intercept
of the regression line, so you can extend your conclusions beyond just this
sample of ten students.
The formula for a 1 – α level confidence interval for the y-intercept (a) of a
simple linear regression line is $a \pm t^* \times SE_a$. The standard error, $SE_a$, is equal to
$s\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum_i (x_i - \bar{x})^2}}$, with $s = \sqrt{\frac{\sum_i (y_i - \hat{y}_i)^2}{n - 2}}$, where again the value
of t* comes from the t-distribution with n – 2 degrees of freedom whose area
to the right is equal to α ÷ 2. Using the output from Figure 4-3 and the t-table,
I’m 95 percent confident that the quiz score (y) for someone with a study
time of x = 0 minutes is 3.29 ± 2.306 * 0.4864, which is anywhere from 2.17
to 4.41, on average. Note that 2.306 comes from the t-table with 10 – 2 = 8
degrees of freedom and 0.4864 is the SE for the y-intercept from Figure 4-3.
(So studying for zero minutes for my quiz is not something to aspire to.)
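If you want to check that arithmetic yourself, here is a minimal Python sketch. The three numbers come straight from the text above (the Coef and SE for Constant in Figure 4-3, and t* from the t-table with 8 degrees of freedom); the function name confidence_interval is just an illustrative choice, not something from the software output.

```python
# Values taken from the text (Figure 4-3 output and the t-table).
intercept = 3.29        # Coef for Constant (the y-intercept, a)
se_intercept = 0.4864   # SE for the y-intercept
t_star = 2.306          # t* with 10 - 2 = 8 df, right-tail area 0.025


def confidence_interval(estimate, std_error, t_crit):
    """Return (lower, upper) for estimate ± t_crit * std_error."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin


lower, upper = confidence_interval(intercept, se_intercept, t_star)
print(f"95% CI for the y-intercept: {lower:.2f} to {upper:.2f}")
```

The same function works for any coefficient in the output, as long as you feed it that coefficient's own standard error and the right t* for your sample size.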
By the way, to find out how much studying affected the quiz score for
these students, look at the estimate of the slope in the output from
Figure 4-3: the coefficient for slope is 0.1793, which says each minute of
studying is related to an increase in score of 0.1793 of a point, plus or minus
the margin of error, of course. Or, 10 more minutes relates to 1.793 more
points. On a 10-point quiz, it all adds up!
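To see the slope and y-intercept working together, the following short Python sketch plugs the two coefficients from the text into the regression line. The function name predicted_score is a made-up helper for illustration, and the sketch ignores the margin of error, so treat the results as point estimates only.

```python
# Coefficients taken from the text (Figure 4-3).
intercept = 3.29   # predicted quiz score at 0 minutes of study
slope = 0.1793     # change in predicted score per extra minute


def predicted_score(minutes):
    """Point estimate from the fitted line: y = a + b * x."""
    return intercept + slope * minutes


# Compare a student who studied 10 minutes with one who didn't study.
gain = predicted_score(10) - predicted_score(0)
print(f"Predicted gain from 10 more minutes: {gain:.3f} points")
```

Keep the study times you plug in within the range the ten students actually reported; the line isn't meant for values far outside the data.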