Page 227 - Computational Statistics Handbook with MATLAB

Most simulations would have M > 1000, but M between 10,000 and 25,000 is not uncommon. One important guideline for determining the number of trials is the purpose of the simulation. If the tail of the distribution is of interest (e.g., estimating Type I error or getting p-values), then more trials are needed to ensure that there will be a good estimate of that area.
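
To see why tail areas call for more trials, the following minimal sketch computes the standard error of an estimated tail probability for several values of M. It assumes the estimate is simply the proportion of M independent trials that fall in the tail, so the usual binomial standard error applies. The tail area p = 0.05 and the variable names are illustrative assumptions, not values from the text.

```matlab
% Sketch: Monte Carlo standard error of an estimated tail area.
% Assumes the estimate is the proportion of M independent trials
% that land in the tail (binomial standard error).
p = 0.05;                    % assumed true tail area (illustrative)
M = [1000 10000 25000];      % numbers of trials to compare
se = sqrt(p*(1-p)./M);       % standard error of the estimated proportion
relErr = se/p;               % standard error as a fraction of p
disp([M' se' relErr'])       % one row per value of M
```

Under these assumptions the standard error is roughly 14% of p when M = 1000 and falls to roughly 3% of p when M = 25,000. A smaller tail area would need still more trials to reach the same relative precision.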






6.4 Bootstrap Methods

The treatment of the bootstrap methods described here comes from Efron and Tibshirani [1993]. The interested reader is referred to that text for more information on the underlying theory behind the bootstrap. There does not seem to be a consistent terminology in the literature for what techniques are considered bootstrap methods. Some refer to the resampling techniques of the previous section as bootstrap methods. Here, we use bootstrap to refer to Monte Carlo simulations that treat the original sample as the pseudo-population or as an estimate of the population. Thus, in the steps where we randomly sample from the pseudo-population, we now resample from the original sample.

In this section, we discuss the general bootstrap methodology, followed by some applications of the bootstrap. These include bootstrap estimates of the standard error, bootstrap estimates of bias, and bootstrap confidence intervals.



General Bootstrap Methodology

The bootstrap is a method of Monte Carlo simulation where no parametric assumptions are made about the underlying population that generated the random sample. Instead, we use the sample as an estimate of the population. This estimate is called the empirical distribution $\hat{F}$, where each $x_i$ has probability mass $1/n$. Thus, each $x_i$ has the same likelihood of being selected in a new sample taken from $\hat{F}$.
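
As a rough illustration of the empirical distribution, the sketch below places mass 1/n on each observation and evaluates the corresponding empirical distribution function at one point. The sample values and the names `mass` and `Fhat` are assumptions chosen for illustration. Only base MATLAB functions are used.

```matlab
% Sketch: the empirical distribution puts mass 1/n on each observation.
x = [5 8 3 2];                 % illustrative sample (arbitrary values)
n = length(x);
mass = ones(1,n)/n;            % probability mass 1/n attached to each x(i)
Fhat = @(t) sum(x <= t)/n;     % empirical distribution function at scalar t
disp(mass)                     % every observation has the same mass
disp(Fhat(4))                  % fraction of the sample that is <= 4
```

Because every observation carries the same mass, drawing from this estimate amounts to picking one of the n sample values uniformly at random.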
When we use $\hat{F}$ as our pseudo-population, we resample with replacement from the original sample $x = (x_1, \ldots, x_n)$. We denote the new sample obtained in this manner by $x^* = (x_1^*, \ldots, x_n^*)$. Since we are sampling with replacement from the original sample, there is a possibility that some points $x_i$ will appear more than once in $x^*$ or maybe not at all. We are looking at the univariate situation, but the bootstrap concepts can also be applied in the d-dimensional case.
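
Resampling with replacement can be sketched in a few lines. This is one possible implementation, not necessarily the approach used elsewhere in the book: it draws random indices with the base MATLAB function `randi` and uses them to index the original sample. The sample values, the number of bootstrap samples `B`, and the names `idx` and `xstar` are hypothetical choices for illustration.

```matlab
% Sketch: draw B bootstrap samples by resampling x with replacement.
x = [5 8 3 2];            % illustrative original sample (arbitrary values)
n = length(x);
B = 5;                    % number of bootstrap samples (hypothetical choice)
idx = randi(n, B, n);     % B-by-n indices, each drawn uniformly from 1..n
xstar = x(idx);           % row b holds the b-th bootstrap sample x*
disp(xstar)               % rows may repeat some values and omit others
```

Each index is drawn independently, so a given observation can show up several times in a row of `xstar` or be missing from it entirely, which is the behavior described above.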
A small example serves to illustrate these ideas. Let’s say that our random sample consists of the four numbers $x = (5, 8, 3, 2)$. The following are possible samples $x^*$ when we sample with replacement from $x$:
                             © 2002 by Chapman & Hall/CRC