Page 87 - Statistics II for Dummies

Chapter 4: Getting in Line with Simple Linear Regression  71


                                This difference also makes sense from a statistical point of view. A prediction
                                interval has more variability than a confidence interval because it’s harder
                                to predict y for a single individual at x* than it is to estimate the average
                                value of y for everyone at x*. (For example, individual test scores vary more
                                than average test scores do.) As a result, a prediction interval is wider than
                                the confidence interval at the same x* and confidence level; it has a larger
                                margin of error.

                                A similarity between prediction intervals and confidence intervals is that
                                their margin of error formulas both contain x*, which means the margin of
                                error in either case depends on which value of x* you use. In both cases, the
                                margin of error is at its smallest when you use the mean value of x as your
                                x*, because there’s more data around the mean of x than at any other value.
                                (Both formulas include the squared distance between x* and the mean of x,
                                and that distance is zero at the mean.) As you move x* away from the mean of
                                x in either direction, the margin of error increases for each interval.
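
To see both points with numbers, here is a minimal sketch in Python that computes the two margins of error at a chosen x*. It uses the standard textbook formulas for simple linear regression, t* · s · sqrt(1/n + (x* − x̄)²/Sxx) for the confidence interval and t* · s · sqrt(1 + 1/n + (x* − x̄)²/Sxx) for the prediction interval. The function names, the small data set, and the t* value of 2.447 (read from a t-table for 95% confidence with n − 2 = 6 degrees of freedom) are assumptions made up for illustration; they don't come from the chapter.

```python
import math


def fit_line(x, y):
    """Least-squares slope and intercept for a simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def margins_of_error(x, y, x_star, t_star):
    """Return (confidence margin, prediction margin) at x_star.

    t_star is supplied by the caller (for example, from a t-table
    with n - 2 degrees of freedom); it is not computed here.
    """
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope, intercept = fit_line(x, y)

    # s estimates the spread of the y's around the line (n - 2 df).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    # Penalty for how far x_star sits from the mean of x; zero at the mean.
    distance = (x_star - x_bar) ** 2 / sxx

    ci_margin = t_star * s * math.sqrt(1 / n + distance)
    pi_margin = t_star * s * math.sqrt(1 + 1 / n + distance)
    return ci_margin, pi_margin


# Hypothetical data, invented only for this illustration.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

for x_star in (4.5, 6.0, 8.0):  # 4.5 is the mean of x
    ci, pi = margins_of_error(x, y, x_star, t_star=2.447)
    print(f"x* = {x_star}: confidence margin = {ci:.3f}, prediction margin = {pi:.3f}")
```

In the printed rows, the prediction margin should be larger than the confidence margin at every x*, and both should grow as x* moves from 4.5 toward 8.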



                      Checking the Model’s Fit (The Data, Not the Clothes!)



                                After you’ve established a relationship between x and y and have come up
                                with an equation of a line that represents that relationship, you may think
                                your job is done. (Many researchers mistakenly stop here, so I’m depending on
                                you to break the cycle!) The most important job remains to be completed:
                                checking to be sure that the conditions of the model are truly met and that
                                the model fits well in more specific ways than the scatterplot and correlation
                                measure (which I cover in the earlier section “Exploring Relationships with
                                Scatterplots and Correlations”).

                                This section presents methods for defining and assessing the fit of a simple
                                linear regression model.



                                Defining the conditions

                                Two major conditions must be met before you apply a simple linear
                                regression model to a data set:

                                  ✓ The y’s must have an approximately normal distribution for each value
                                    of x.
                                  ✓ The y’s must have a constant amount of spread (standard deviation) for
                                    each value of x.
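
Both conditions describe how the y’s scatter around the line, so a rough numerical check can start from the residuals (each observed y minus the y the line predicts). The sketch below is only one possible quick look, not a formal test and not necessarily the method this chapter goes on to use. It compares the residual spread in the lower and upper halves of the x values (for the constant-spread condition) and counts how many residuals fall within two standard deviations of zero (about 95 percent would be expected under normality). The function names and the data are hypothetical.

```python
import statistics


def residuals(x, y):
    """Observed y minus fitted y for a least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def spread_by_half(x, res):
    """Standard deviation of residuals for the lower and upper half of x.

    Similar values are consistent with constant spread; very different
    values suggest the spread changes with x.
    """
    pairs = sorted(zip(x, res))  # order residuals by their x value
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return statistics.stdev(low), statistics.stdev(high)


def share_within_two_sd(res):
    """Fraction of residuals within two standard deviations of zero."""
    s = statistics.stdev(res)
    return sum(abs(r) <= 2 * s for r in res) / len(res)


# Hypothetical data, invented only for this illustration.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

res = residuals(x, y)
low_sd, high_sd = spread_by_half(x, res)
print(f"residual SD, lower half of x: {low_sd:.3f}")
print(f"residual SD, upper half of x: {high_sd:.3f}")
print(f"share of residuals within 2 SDs: {share_within_two_sd(res):.2f}")
```

With only eight points these numbers are crude; on a real data set, a plot of the residuals against x usually tells you more than any single summary.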












