Page 305 - Statistics for Dummies

Chapter 18: Looking for Links: Correlation and Regression
                                                    y-intercept that can be calculated using formulas (and, I may add, these for-
                                                    mulas aren’t too hard to calculate).

                                                    To save a great deal of time calculating the best fitting line, first find the “big
                                                    five,” five summary statistics that you’ll need in your calculations:
                                                      1.  The mean of the x values (denoted $\bar{x}$)
                                                      2.  The mean of the y values (denoted $\bar{y}$)
                                                      3.  The standard deviation of the x values (denoted $s_x$)
                                                      4.  The standard deviation of the y values (denoted $s_y$)
                                                      5.  The correlation between X and Y (denoted r)
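If you'd rather let a computer find the big five, here is a minimal Python sketch. The function name `big_five` and the tiny data set are made up for illustration, and the sketch assumes the sample versions of the standard deviation and correlation (dividing by n - 1).

```python
import statistics


def big_five(xs, ys):
    """Return (x_bar, y_bar, s_x, s_y, r) for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    s_x = statistics.stdev(xs)  # sample standard deviation (divides by n - 1)
    s_y = statistics.stdev(ys)
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined when x or y never varies")
    # Correlation: sum of the products of deviations from the means,
    # divided by (n - 1) * s_x * s_y.
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = cross / ((n - 1) * s_x * s_y)
    return x_bar, y_bar, s_x, s_y, r


# Made-up data, purely for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
x_bar, y_bar, s_x, s_y, r = big_five(xs, ys)
print(x_bar, y_bar, round(s_x, 4), round(s_y, 4), round(r, 4))
```

For this toy data set the means are 3 and 4, and the correlation comes out to roughly 0.77.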
                                                    Finding the slope
                                                    The formula for the slope, m, of the best-fitting line is
                                                        $m = r \left( \dfrac{s_y}{s_x} \right)$
                                                    where r is the correlation between X and Y, and $s_x$ and $s_y$ are the standard
                                                    deviations of the x-values and the y-values, respectively. You simply divide $s_y$
                                                    by $s_x$ and multiply the result by r.
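The slope calculation is short enough to sketch in a few lines of Python. The function name `slope` and the input values are hypothetical; they stand in for whatever r and standard deviations you found for your own data.

```python
def slope(r, s_x, s_y):
    """Slope of the best-fitting line: divide s_y by s_x, then multiply by r."""
    if s_x == 0:
        raise ValueError("s_x is zero, so the slope is undefined")
    return r * (s_y / s_x)


# Hypothetical summary statistics, rounded to four decimal places.
m = slope(r=0.7746, s_x=1.5811, s_y=1.2247)
print(round(m, 2))  # about 0.6

# A negative correlation produces a negative (downhill) slope.
m_downhill = slope(r=-0.5, s_x=2.0, s_y=4.0)
print(m_downhill)
```

Because the standard deviations are never negative, the sign of the slope always matches the sign of r.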
                                                    Note that the slope of the best-fitting line can be a negative number because
                                                    the correlation can be a negative number. A negative slope indicates that the
                                                    line is going downhill. For example, an increase in police officers is related
                                                    to a decrease in the number of crimes in a linear fashion; the correlation and
                                                    hence the slope of the best-fitting line is negative in this case.
                                                    The correlation and the slope of the best-fitting line are not the same. The for-
                                                    mula for slope takes the correlation (a unitless measurement) and attaches
                                                    units to it. Think of $s_y \div s_x$ as the variation (resembling change) in Y over
                                                    the variation in X, in units of X and Y. For example, this could be the variation
                                                    in temperature (degrees Fahrenheit) over the variation in number of cricket
                                                    chirps (in 15 seconds).
                                                    Finding the y-intercept
                                                    The formula for the y-intercept, b, of the best-fitting line is $b = \bar{y} - m\bar{x}$, where
                                                    $\bar{x}$ and $\bar{y}$ are the means of the x-values and the y-values, respectively, and m is
                                                    the slope (the formula for which is given in the preceding section).
                                                    So to calculate the y-intercept, b, of the best-fitting line, you start by finding
                                                    the slope, m, of the best-fitting line using the steps listed in the preceding
                                                    section. You then multiply m by $\bar{x}$ and subtract your result from $\bar{y}$.
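Here is the same two-step recipe as a minimal Python sketch. The function name `y_intercept` and the numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
def y_intercept(x_bar, y_bar, m):
    """y-intercept of the best-fitting line: multiply m by x_bar, subtract from y_bar."""
    return y_bar - m * x_bar


# Hypothetical values: means of 3 and 4, and a slope of 0.6.
x_bar, y_bar, m = 3.0, 4.0, 0.6
b = y_intercept(x_bar, y_bar, m)
print(round(b, 2))  # about 2.2

# Sanity check: plugging x_bar into the line y = m * x + b gives back y_bar
# (up to floating-point rounding).
print(abs((m * x_bar + b) - y_bar) < 1e-9)
```

The sanity check works for any data set, because the formula for b forces the best-fitting line to pass through the point made of the two means.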



