Page 156 - Intermediate Statistics for Dummies
Chapter 7: When Data Throws You a Curve: Using Nonlinear Regression
How large is too large? Typically, if you can’t fit the data by the time the
degree of the polynomial reaches three, then perhaps a different type of
model would work better. Or you may determine that you observe too
much scatter and haphazard behavior in the data to try to fit any model.
Minitab can do each of these steps for you up to degree two (step two); from
there, you need a more sophisticated statistical software program, such as
SAS or SPSS. However, most of the models you need to fit go up to the
second-degree polynomials. In the next section, you use a second-degree
polynomial to predict a student’s quiz score based on his or her study time.
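
If you'd rather try the degree-by-degree approach outside of Minitab, the following is a minimal sketch of the idea in Python with NumPy. It isn't the book's procedure: the function name fit_polynomials, the cap of degree three, and the demo data are all assumptions made up for illustration. It fits a line, then a quadratic, then a cubic, and reports R-squared for each so you can see where the fit stops improving in any meaningful way.

```python
import numpy as np


def fit_polynomials(x, y, max_degree=3):
    """Fit polynomials of degree 1..max_degree by least squares.

    Returns a dict mapping each degree to its R-squared value.
    (Hypothetical helper for illustration only.)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    total_ss = np.sum((y - y.mean()) ** 2)  # total variation in y
    results = {}
    for degree in range(1, max_degree + 1):
        coeffs = np.polyfit(x, y, degree)      # highest power first
        fitted = np.polyval(coeffs, x)
        resid_ss = np.sum((y - fitted) ** 2)   # variation left unexplained
        results[degree] = 1.0 - resid_ss / total_ss
    return results


# Made-up demo data with a clear down-then-up pattern plus a little noise.
x_demo = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
noise = np.array([0.3, -0.2, 0.1, -0.1, 0.2, -0.3, 0.1, 0.2, -0.2])
y_demo = (x_demo - 4.0) ** 2 + noise

r_squared = fit_polynomials(x_demo, y_demo)
for degree, r2 in r_squared.items():
    print(f"degree {degree}: R-squared = {r2:.3f}")
```

Keep in mind that R-squared can never go down when you add a higher power, so a tiny bump from degree two to degree three isn't a reason to keep the extra term. Look for the degree where the improvement levels off.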
Using a second-degree polynomial to pass the quiz
The first step in fitting a polynomial model is to graph the data in a
scatterplot and see whether the data fall into a particular pattern. Many different
types of polynomials exist to fit data that has a curved type of pattern. One
of the most common patterns found in curved data is the quadratic pattern,
or second-degree polynomial, which goes up and comes back down, or goes
down and comes back up, as the x values move from left to right (see
Figure 7-2a). The second-degree (quadratic) polynomial is the simplest and most
commonly used polynomial beyond the straight line, so it deserves special
consideration.
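
In symbols, a second-degree polynomial model can be written as shown here. The notation (betas for the coefficients, epsilon for the random error) is a common convention assumed for illustration, not something taken from the surrounding text.

```latex
y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon
```

The sign of the coefficient on the squared term tells you which way the curve bends: when it's negative, the curve goes up and comes back down; when it's positive, the curve goes down and comes back up.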
This section is dedicated to looking at a second-degree polynomial. You can
see the exploratory process of graphing data and looking at the graph’s
shape by using the data involving quiz scores and study time. (After you
master the basic ideas based on second-degree polynomials, you can apply
them to polynomials with higher powers.)
Suppose 20 students take a statistics quiz. You record the quiz scores (which
have a maximum score of ten) and the number of hours students reported
studying for the quiz. (You can see the results in Figure 7-3.)
Looking at Figure 7-3, it appears that three camps of students are in this
class. Camp One, on the left end of the x-axis, understands the stuff (as
reflected in their higher scores) and hardly had to study at all
(because their study time on the x-axis is low). Camp Three also did very well
on the quiz (as indicated by their high quiz scores), but had to study a great
deal to get that grade (as seen on the far-right end of the x-axis). The
students in the middle, Camp Two, didn’t seem to fare well. All in all, from the
scatterplot in Figure 7-3, it does appear that study time may explain quiz
scores on some level, and explains them in a way indicative of a second-degree
polynomial. So a quadratic regression model may fit this data.
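
To see what fitting a quadratic regression model to data like this might look like, here is a rough sketch in Python with NumPy. The 20 study times and quiz scores below are invented to mimic the three-camps pattern; they are not the values plotted in Figure 7-3. The names hours, scores, predict_score, and turning_point are likewise hypothetical.

```python
import numpy as np

# Invented data: study time in hours and quiz score (out of ten) for
# 20 students. These are NOT the values from Figure 7-3.
hours = np.array([0.5, 1.0, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5,
                  5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5])
scores = np.array([9, 9, 8, 8, 7, 6, 5, 5, 4, 4,
                   4, 4, 5, 5, 6, 7, 8, 8, 9, 10], dtype=float)

# Least-squares fit of a second-degree polynomial.
# np.polyfit returns coefficients from the highest power down.
b2, b1, b0 = np.polyfit(hours, scores, 2)


def predict_score(study_hours):
    """Predicted quiz score from the fitted quadratic."""
    return b0 + b1 * study_hours + b2 * study_hours ** 2


# Study time where the fitted curve turns around (its vertex).
turning_point = -b1 / (2.0 * b2)

print(f"fitted curve: score = {b0:.2f} + {b1:.2f}*hours + {b2:.2f}*hours^2")
print(f"curve bottoms out near {turning_point:.1f} hours of study")
```

With data shaped like this, the coefficient on the squared term comes out positive, which matches the down-then-up pattern: predicted scores are high at both ends of the x-axis and lowest somewhere in the middle.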