
232                        Computational Statistics Handbook with MATLAB


                             model that has the best accuracy or lowest error. In this chapter, we use the
                             prediction error (see Equation 7.5) to measure the accuracy. One way to
                             assess the error would be to observe new data (average temperature and cor-
                             responding monthly steam usage) and then use the model to predict the
                             monthly steam usage at the newly observed average temperatures. We can
                             compare this prediction with the true steam used and calculate the error. We
                             do this for all of the proposed models and pick the model with the smallest
                             error. The problem with this approach is that it is sometimes impossible to
                             obtain new data, so all we have available to evaluate our models (or our sta-
                             tistics) is the original data set. In this chapter, we consider two methods that
                             allow us to use the data already in hand for the evaluation of the models.
                             These are cross-validation and the jackknife.
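The idea of measuring prediction error on newly observed data can be sketched as follows. This is an illustrative sketch in Python (the book's own examples use MATLAB), and the steam data here are synthetic stand-ins generated from an assumed linear relationship, not the book's actual data set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the "original" data: monthly average
# temperature (x) and steam usage (y); an assumed linear relationship
# plus noise, used only to illustrate the mechanics.
x = rng.uniform(30, 80, size=25)
y = 13.6 - 0.08 * x + rng.normal(0, 0.5, size=25)

# Fit a simple linear model to the original data
# (np.polyfit returns coefficients highest degree first).
b1, b0 = np.polyfit(x, y, 1)

# "New" observations from the same process.
x_new = rng.uniform(30, 80, size=10)
y_new = 13.6 - 0.08 * x_new + rng.normal(0, 0.5, size=10)

# Prediction error: mean squared difference between the observed
# and the predicted steam usage on the new data.
y_hat = b0 + b1 * x_new
pe = np.mean((y_new - y_hat) ** 2)
print(f"estimated prediction error: {pe:.3f}")
```

Repeating this calculation for each candidate model and keeping the one with the smallest error is exactly the comparison described above; cross-validation mimics it when no new data can be collected.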
                              Cross-validation is typically used to determine the classification error rate
                             for pattern recognition applications or the prediction error when building
                             models. In Chapter 9, we will see two applications of cross-validation where
                             it is used to select the best classification tree and to estimate the misclassifica-
                             tion rate. In this chapter, we show how cross-validation can be used to assess
                             the prediction accuracy in a regression problem.
                              In the previous chapter, we covered the bootstrap method for estimating
                             the bias and standard error of statistics. The jackknife procedure has a similar
                             purpose and was developed prior to the bootstrap [Quenouille, 1949]. The
                             connection between the methods is well known and is discussed in the liter-
                             ature [Efron and Tibshirani, 1993; Efron, 1982; Hall, 1992]. We include the
                             jackknife procedure here, because it is more a data partitioning method than
                             a simulation method such as the bootstrap. We return to the bootstrap at the
                             end of this chapter, where we present another method of constructing boot-
                             strap confidence intervals using the jackknife. In the last section, we show
                             how the jackknife method can be used to assess the error in our bootstrap
                             estimates.
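As a preview of the jackknife's purpose, the following sketch (in Python rather than the book's MATLAB; the data are made up for illustration) estimates the standard error of a statistic by recomputing it with each observation left out in turn. For the sample mean, the jackknife standard error reduces exactly to the familiar formula s divided by the square root of n, which gives a useful sanity check.

```python
import numpy as np

def jackknife_se(data, stat):
    """Jackknife estimate of the standard error of a statistic:
    recompute the statistic with each observation left out, then
    scale the spread of the leave-one-out replicates."""
    n = len(data)
    reps = np.array([stat(np.delete(data, i)) for i in range(n)])
    return np.sqrt((n - 1) / n * np.sum((reps - reps.mean()) ** 2))

# Made-up sample, used only to check the mechanics.
rng = np.random.default_rng(2)
x = rng.normal(10, 2, size=50)

# For the sample mean, the jackknife standard error agrees exactly
# with the usual formula s / sqrt(n).
print(jackknife_se(x, np.mean))
print(x.std(ddof=1) / np.sqrt(len(x)))
```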






                             7.2 Cross-Validation
                             Often, one of the jobs of a statistician or engineer is to create models using
                             sample data, usually for the purpose of making predictions. For example,
                             given a data set that contains the drying time and the tensile strength of
                             batches of cement, can we model the relationship between these two vari-
                             ables? We would like to be able to predict the tensile strength of the cement
                             for a given drying time that we will observe in the future. We must then
                             decide what model best describes the relationship between the variables and
                             estimate its accuracy.
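Choosing between candidate models and estimating their accuracy from a single data set can be sketched with leave-one-out cross-validation, the simplest form of the method this section develops. The sketch below is in Python (the book works in MATLAB), and the cement data are a hypothetical stand-in generated from an assumed curved relationship between drying time and tensile strength.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical cement data: drying time (days) vs tensile strength,
# generated from an assumed curved relationship plus noise.
time = rng.uniform(1, 28, size=20)
strength = 40 * time / (4 + time) + rng.normal(0, 1.0, size=20)

def loocv_error(x, y, degree):
    """Leave-one-out cross-validation: fit a polynomial of the given
    degree with each point held out in turn, and average the squared
    prediction errors at the held-out points."""
    n = len(x)
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coeffs = np.polyfit(x[keep], y[keep], degree)
        errs[i] = (y[i] - np.polyval(coeffs, x[i])) ** 2
    return errs.mean()

# Compare candidate models: straight line vs quadratic.
for d in (1, 2):
    print(f"degree {d}: LOOCV prediction error = "
          f"{loocv_error(time, strength, d):.3f}")
```

Each point is predicted by a model that never saw it, so the resulting error estimates are honest in a way that evaluating on the fitting data is not.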
                              Unfortunately, in many cases the naive researcher will build a model based
                             on the data set and then use that same data to assess the performance of the
                             model. The problem with this is that the model is being evaluated or tested


                            © 2002 by Chapman & Hall/CRC