Page 135 - Intermediate Statistics for Dummies
                                         Part II: Making Predictions by Using Regression
                                                     4. Examine the p-values from each of the t-tests in step three (listed on
                                                        the Minitab output) and choose the smallest one.
                                                        The variable associated with that p-value is the best candidate to be
                                                        added to the model, because that variable is the most statistically
                                                        significant of all the possible x variables at this point.
                                                     5. If the p-value for the x variable found in step four is smaller than the
                                                        prespecified α, add that x variable to the model.
                                                        After the first round, you have the model y = b_0 + b_i x_i, where x_i
                                                        refers to the first x variable you added to the model.
                                                     6. Repeat steps three through five, using the new model from step
                                                        five, and keep adding variables one at a time as long as the smallest
                                                        p-value of each round is less than the prespecified α (for example, α = 0.05).
                                                        If the smallest p-value is larger than the prespecified α, don’t add any
                                                        more variables to the model and stop the forward selection process.
                                                        Your final model contains all of the x variables that were added during
                                                        each phase of the forward selection process.
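The steps above can be sketched in code. What follows is a minimal illustration in Python with NumPy and SciPy, not Minitab's own implementation. The function names (coef_p_values, forward_selection), the alpha_to_enter parameter, and the simulated data are all assumptions made for this example. Each round fits an ordinary least squares model for every candidate x variable, takes the p-value of that candidate's t-test, and adds the candidate with the smallest p-value only if it is below the entry level.

```python
import numpy as np
from scipy import stats


def coef_p_values(X, y):
    """Fit OLS with an intercept and return the two-sided t-test
    p-value for each column of X (the intercept is excluded)."""
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])
    beta, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = n - design.shape[1]                      # residual degrees of freedom
    sigma2 = resid @ resid / dof                   # estimated error variance
    cov = sigma2 * np.linalg.inv(design.T @ design)
    t_stats = beta / np.sqrt(np.diag(cov))
    p = 2 * stats.t.sf(np.abs(t_stats), dof)
    return p[1:]


def forward_selection(X, y, alpha_to_enter=0.05):
    """Return the column indices of X entered into the model, in order."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining:
        # Steps 3 and 4: try each remaining variable alongside the ones
        # already in the model and record the p-value of the new variable.
        trials = []
        for j in remaining:
            p_new = coef_p_values(X[:, selected + [j]], y)[-1]
            trials.append((p_new, j))
        best_p, best_j = min(trials)
        # Steps 5 and 6: add the best candidate only if it beats alpha.
        if best_p >= alpha_to_enter:
            break
        selected.append(best_j)
        remaining.remove(best_j)
    return selected


# Simulated data (an assumption for this sketch): only columns 0 and 2
# truly affect y; columns 1, 3, and 4 are pure noise.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 5))
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(size=n)

selected = forward_selection(X, y, alpha_to_enter=0.05)
print("columns entered, in order:", selected)
```

With a signal this strong, the two real predictors should enter the model. Whether a noise column also slips in depends on the random sample, which is exactly the Type I error risk that the entry level controls.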
                                                    To find a best multiple linear regression model by using the forward selection
                                                    procedure in Minitab, go to Stat>Regression>Stepwise. Highlight the response
                                                    (y) variable and click Select. This variable will show up in the Response
                                                    box. Then highlight the predictor (x) variables and click Select. These
                                                    variables will show up in the Predictor box.
                                                    Click on Methods, and click on Forward Selection. In the Alpha to Enter box,
                                                    enter the prespecified value of α that an x variable must meet to be
                                                    included in the model. Statisticians typically set this value between
                                                    0.05 and 0.10. (I use 0.05.) This prespecified α level is called the
                                                    entry level for the forward selection procedure. The higher the entry level,
                                                    the easier it is for a variable to be entered, but the greater the chance that
                                                    the variable was significant just by random chance. (In the F-value box, the
                                                    default is 4.0, which should be fine. The F-value is beyond the scope of this
                                                    book in this context, although you do work with it when you do analysis of
                                                    variance; see Chapter 10.) Click OK and you get the output from the forward
                                                    selection procedure.
                                                    You use a prespecified α level as the entry criterion for adding a variable
                                                    because it represents the chance of making a Type I error and inadvertently
                                                    putting in a variable based on your sample when it shouldn’t be included.
                                                    (See Chapter 3 for more on Type I errors.) You choose a small α level because
                                                    you don’t want to make it too easy to add a variable; doing so increases the
                                                    chance of adding something that isn’t truly meaningful. (You have to put a lid
                                                    on it somehow!)
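To see why the entry level matters, here is a rough back-of-the-envelope calculation in Python. It assumes, purely for illustration, that you have k candidate x variables that are all unrelated to y and that their t-tests behave independently, which real data rarely satisfies exactly. Under that assumption, the chance that at least one of them passes the entry level α by luck is 1 - (1 - α)^k. The function name chance_of_false_entry is made up for this sketch.

```python
def chance_of_false_entry(alpha, k):
    """Chance that at least one of k unrelated candidate variables has a
    p-value below alpha, assuming the k tests are independent."""
    return 1 - (1 - alpha) ** k


# Compare the two ends of the typical range for the entry level.
for alpha in (0.05, 0.10):
    for k in (1, 5, 10):
        chance = chance_of_false_entry(alpha, k)
        print(f"alpha = {alpha:.2f}, k = {k:2d}: chance = {chance:.3f}")
```

Under these assumptions, with ten unrelated candidates the chance is about 0.40 at α = 0.05 and about 0.65 at α = 0.10, so raising the entry level makes it noticeably easier for a meaningless variable to get in.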