Page 248 - Computational Statistics Handbook with MATLAB



                              The prediction error is defined as

                                  $\mathrm{PE} = E[(Y - \hat{Y})^2]$,                (7.5)

                             where the expectation is with respect to the true population. To estimate the
                             error given by Equation 7.5, we need to test our model (obtained from
                             polyfit) using an independent set of data that we denote by $(x_i', y_i')$.
                             This means that we would take an observed $(x_i', y_i')$ and obtain the
                             estimate of $y_i'$, denoted $\hat{y}_i'$, using our model:

                                  $\hat{y}_i' = \hat{\beta}_0 + \hat{\beta}_1 x_i'$.         (7.6)

                             We then compare $\hat{y}_i'$ with the true value of $y_i'$. Obtaining the outputs, or $\hat{y}_i'$,
                             from the model is easily done in MATLAB using the polyval function as
                             shown in Example 7.2.
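
As a minimal sketch of Equation 7.6, separate from Example 7.2, the prediction step might look like the following. The data values here are made up purely for illustration and are not the steam data; the variable name `betahat` is likewise an assumption.

```matlab
% Hypothetical training data (NOT the steam data set).
xtrain = [1 2 3 4 5];
ytrain = [2.1 3.9 6.2 7.8 10.1];
% Fit a first-degree polynomial. polyfit returns the coefficients
% in descending powers, so betahat = [beta1hat, beta0hat].
betahat = polyfit(xtrain, ytrain, 1);
% Hypothetical independent test inputs x_i'.
xtest = [1.5 2.5 3.5];
% Evaluate the fitted line at the test inputs (Equation 7.6).
yhat = polyval(betahat, xtest);
```

Note that `polyval` expects the coefficient vector in the same descending-power order that `polyfit` returns, so the two functions can be chained directly.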
                              Say we have $m$ independent observations $(x_i', y_i')$ that we can use to test
                             the model. We estimate the prediction error (Equation 7.5) using

                                  $\widehat{\mathrm{PE}} = \frac{1}{m} \sum_{i=1}^{m} (y_i' - \hat{y}_i')^2$.      (7.7)

                             Equation 7.7 measures the average squared error between the predicted
                             response obtained from the model and the true measured response. Other
                             measures of error can also be used, such as the absolute difference
                             between the observed and predicted responses.
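
To make Equation 7.7 concrete, here is a small sketch that computes the estimated prediction error alongside the mean absolute error as one possible alternative measure. The numbers are hypothetical, chosen only for illustration, and the names `pe_hat` and `mae` are assumptions rather than anything used in the text.

```matlab
% Hypothetical observed test responses y_i' and predictions yhat_i'.
ytest = [3.0 5.1 6.9 9.2];
yhat  = [3.2 4.9 7.0 9.0];
m = length(ytest);
% Equation 7.7: average squared difference between the
% observed and predicted responses.
pe_hat = sum((ytest - yhat).^2) / m;
% One alternative measure: the average absolute difference.
mae = mean(abs(ytest - yhat));
```

Squared error penalizes large deviations more heavily than absolute error does, so the two measures can rank competing models differently.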


                             Example 7.2
                             We now show how to estimate the prediction error using Equation 7.7. We
                             first choose some points from the steam data set and put them aside to use
                             as an independent test sample. The rest of the observations are then used to
                             obtain the model.

                                load steam
                                % Get the set that will be used to
                                % test the model.
                                indtest = 2:2:20; % Just pick some points.
                                xtest = x(indtest);
                                ytest = y(indtest);
                                % Now get the observations that will be
                                % used to fit the model.
                                xtrain = x;
                                ytrain = y;
                                % Remove the test observations.


                            © 2002 by Chapman & Hall/CRC