Page 122 - Statistics II for Dummies

                       Part II: Using Different Types of Regression to Make Predictions
                                  Other variables you may think of that are related to punt distance may
                                  include the direction and speed of the wind at the time of the punt, the angle
                                  at which the ball was snapped, the average distance of punts made in the
                                  past by a particular punter, whether the game is at home or away in a hostile
                                  environment, and so on. However, these researchers seem to have enough
                                  information on their hands to build a model to estimate punt distance.

                                  For the sake of simplicity, assume the kicker is right-footed. That isn’t
                                  always the case, but right-footed kickers are the overwhelming majority.

                                  Looking just at this raw data set in Table 6-1, you can’t figure out which
                                  variables, if any, are related to distance of the punt or how those variables
                                  may be related to punt distance. You need more analyses to get a handle on this.


                                  Examining scatterplots and correlations


                                  After you’ve identified a set of possible x variables, the next step is to find out
                                  which of these variables are highly related to y in order to start trimming
                                  down the set of possible candidates for the final model. In the punt distance
                                  example, the goal is to see which of the six variables in Table 6-1 are strongly
                                  related to punt distance. The two ways to look at these relationships are

                                   ✓ Scatterplot: A graphical technique
                                   ✓ Correlation: A one-number measure of the linear relationship between
                                      two variables
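
To make the "one-number measure" concrete, here is a minimal sketch of the Pearson correlation coefficient, written with the Python standard library only. The function name `pearson_r` and the five data points are illustrative assumptions; the values are made up and are not the data from Table 6-1.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative values, NOT the data from Table 6-1.
hang_time = [3.8, 4.1, 4.4, 4.6, 4.9]       # seconds (hypothetical)
distance = [38.0, 41.0, 45.0, 44.0, 50.0]   # yards (hypothetical)

r = pearson_r(hang_time, distance)
print(round(r, 3))
```

The result always falls between -1 and +1. A value near +1 suggests a strong uphill linear relationship, a value near -1 a strong downhill one, and a value near 0 little or no linear relationship.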

                                  Seeing relationships through scatterplots

                                  To begin examining the relationships between the x variables and y, you use
                                  a series of scatterplots. Figure 6-1 shows all the scatterplots — not only of
                                  each x variable with y but also of each x variable with the other x variables.
                                  The scatterplots are in the form of a matrix, which is a table made of rows
                                  and columns. For example, the first scatterplot in row two of Figure 6-1
                                  looks at the variables of distance (which appears in column one) and hang
                                  time (which appears in row two). This scatterplot shows a possible positive
                                  (uphill) linear relationship between distance and hang time.

                                  Note that Figure 6-1 is essentially a symmetric matrix across the diagonal
                                  line. The scatterplot for distance and hang time is the same as the scatterplot
                                  for hang time and distance; the x and y axes are just switched. The essential
                                  relationship shows up either way. So you only have to look at all the
                                  scatterplots below the diagonal (where the variable names appear) or above
                                  the diagonal. You don’t need to examine both.
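
The symmetry argument can be sketched in a few lines of Python: list each pair of variables once, and you have exactly the plots below the diagonal. Only distance and hang time are named in the text; `x3` and `x4` are hypothetical placeholders for other variables, and `lower_triangle_pairs` is an illustrative helper, not part of any library.

```python
from itertools import combinations

# "distance" and "hang_time" come from the text; "x3" and "x4" are
# hypothetical placeholders standing in for other variables.
variables = ["distance", "hang_time", "x3", "x4"]

def lower_triangle_pairs(names):
    """Return (column_variable, row_variable) pairs for plots below the diagonal."""
    # combinations() yields each unordered pair once, earlier name first,
    # so the earlier variable is the column and the later one is the row.
    return list(combinations(names, 2))

pairs = lower_triangle_pairs(variables)
for col_var, row_var in pairs:
    print(f"row: {row_var:<10} column: {col_var}")

# With k variables the full grid has k * k cells, but only k * (k - 1) / 2
# distinct scatterplots need to be examined.
k = len(variables)
print(len(pairs), "plots to examine out of", k * k, "cells")
```

The first pair listed puts distance in the column and hang time in the row, which matches the first scatterplot in row two described above.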











