Page 123 - Intermediate Statistics for Dummies
                                         Part II: Making Predictions by Using Regression
                                         Checking the Fit of the Model
                                                    Before you run to your boss in triumph saying you’ve slam-dunked the
                                                    question of how to estimate plasma TV sales, you first have to make sure all your
                                                    i’s are dotted and all your t’s are crossed, as you do with any other statistical
                                                    procedure. In this case, you have to check the conditions of the multiple
                                                    regression model. These conditions mainly focus on the residuals (the
                                                    differences between the observed values of y from your data and the
                                                    estimated values of y). If the model is close to the actual data you collected, you can
                                                    feel somewhat confident that if you collected more data, it would fall in line
                                                    with the model as well, and your predictions shouldn’t be too bad.
                                                    In this section, you see what the conditions are for multiple regression, and
                                                    specific techniques statisticians use to check each of those conditions. The
                                                    main character in all of this condition checking is the residual.
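To make the idea of a residual concrete, here is a minimal sketch in Python. The data values and the coefficients b0, b1, and b2 are hypothetical (they are not from the plasma TV example, and they are not actual least-squares estimates); the point is only that each residual is the observed y minus the estimated y.

```python
# Hypothetical data: each row is (x1, x2, observed y).
# These numbers are made up for illustration only.
data = [
    (1.0, 2.0, 9.5),
    (2.0, 1.0, 9.5),
    (3.0, 4.0, 19.5),
    (4.0, 3.0, 19.5),
]

# Hypothetical coefficients for a model of the form
# estimated y = b0 + b1*x1 + b2*x2 (assumed, not fitted here).
b0, b1, b2 = 2.0, 3.0, 2.0


def fitted_value(x1, x2):
    """Return the estimated y from the assumed model."""
    return b0 + b1 * x1 + b2 * x2


# Residual = observed y minus estimated y, one per data point.
residuals = [y - fitted_value(x1, x2) for x1, x2, y in data]
print(residuals)
```

In practice the coefficients come from the software's regression fit; the subtraction step is the same.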
                                                    Noting the conditions
                                                    The conditions for multiple regression concentrate on the error terms, or
                                                    residuals. The residuals are the amount that’s left over after the model has been fit.
                                                    They represent the difference between the actual value of y observed in the
                                                    data set and the estimated value of y based on the model. The conditions of
                                                    the multiple regression model are the following (note that all need to be met in
                                                    order to give the go-ahead for a multiple regression model):
                                                       The residuals have a normal distribution with mean zero.
                                                       The residuals have the same variance for each fitted (predicted)
                                                        value of y.
                                                       The residuals are independent (don’t affect each other).
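The three conditions above can also be given a rough numeric once-over. The sketch below is an assumption-laden stand-in, not the plot-based checks this section goes on to describe: the (fitted value, residual) pairs are hypothetical, the variance comparison simply splits the data in half by fitted value, and the independence check uses the Durbin-Watson statistic (values near 2 suggest little correlation between neighboring residuals).

```python
from statistics import mean, pvariance

# Hypothetical (fitted value, residual) pairs, listed in the order
# the observations were collected. Made up for illustration only.
pairs = [(10.0, 0.5), (12.0, 0.3), (15.0, -1.0), (18.0, -0.2),
         (20.0, 0.8), (23.0, -0.6), (25.0, 0.4), (28.0, -0.2)]
residuals = [r for _, r in pairs]

# Condition 1 (in part): the residuals should average out to about zero.
resid_mean = mean(residuals)

# Condition 2: the spread of the residuals should be similar for low
# and high fitted values. Compare the variance in each half.
by_fit = sorted(pairs)  # sorts by fitted value
half = len(by_fit) // 2
var_low = pvariance([r for _, r in by_fit[:half]])
var_high = pvariance([r for _, r in by_fit[half:]])
spread_ratio = max(var_low, var_high) / min(var_low, var_high)

# Condition 3: Durbin-Watson statistic, computed on the residuals in
# collection order. It always falls between 0 and 4.
dw = (sum((residuals[i] - residuals[i - 1]) ** 2
          for i in range(1, len(residuals)))
      / sum(r * r for r in residuals))

print(f"mean of residuals: {resid_mean:.3f}")
print(f"variance ratio (larger / smaller half): {spread_ratio:.2f}")
print(f"Durbin-Watson: {dw:.2f}")
```

Numbers like these only flag obvious trouble; with a small data set, the residual plots are usually the more informative check.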
                                                    Plotting a plan to check the conditions
                                                    It may sound like you have a ton of things to check, but luckily,
                                                    Minitab gives you all the info you need in a series of four graphs,
                                                    all presented at one time. These plots are called the residual plots. One of them,
                                                    for example, graphs the residuals against the values of a normal distribution to see
                                                    whether the normality condition fits.
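Graphing residuals against the values of a normal distribution can be sketched without any plotting software. The Python below pairs each sorted residual with the value a standard normal distribution would be expected to produce at the same rank. The residuals are hypothetical, and the plotting-position formula (i - 0.375) / (n + 0.25) is one common choice assumed here for illustration; it is not necessarily the formula Minitab uses.

```python
from statistics import NormalDist

# Hypothetical residuals, made up for illustration only.
residuals = [0.5, 0.3, -1.0, -0.2, 0.8, -0.6, 0.4, -0.2]

n = len(residuals)
ordered = sorted(residuals)

# For the i-th smallest residual, find the standard normal value at the
# matching cumulative probability (assumed plotting-position formula).
standard_normal = NormalDist()  # mean 0, standard deviation 1
theoretical = [standard_normal.inv_cdf((i - 0.375) / (n + 0.25))
               for i in range(1, n + 1)]

# Each pair is one point on the plot: (normal value, observed residual).
points = list(zip(theoretical, ordered))
for normal_value, residual in points:
    print(f"{normal_value:7.3f}  {residual:7.3f}")
```

If the residuals are roughly normal, these points fall close to a straight line when plotted; strong curvature or stray points at the ends suggest the normality condition is in doubt.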