Page 264 - Computational Statistics Handbook with MATLAB

                             not appear. Efron and Tibshirani [1993] show that if $n \geq 10$ and $B \geq 20$, then
                             the probability is low that every bootstrap sample contains a given point $x_i$.
                             We estimate the value of $\hat{\gamma}_B^{(-i)}$ by taking the bootstrap replicates for samples
                             that do not contain the data point $x_i$. These steps are outlined below.
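
A short calculation suggests why this probability is low. It is our own illustration, not the argument given by Efron and Tibshirani [1993], and it assumes that each of the $n$ draws in a bootstrap sample picks one of the $n$ data points uniformly and independently, and that the $B$ samples are independent of one another.

```latex
\begin{align*}
P\left(x_i \notin x^{*b}\right) &= \left(1 - \frac{1}{n}\right)^{n}, \\
P\left(x_i \in x^{*b} \text{ for every } b = 1, \ldots, B\right)
  &= \left[1 - \left(1 - \frac{1}{n}\right)^{n}\right]^{B}, \\
n = 10,\ B = 20: \quad \left(1 - 0.9^{10}\right)^{20}
  &\approx 0.6513^{20} \approx 1.9 \times 10^{-4}.
\end{align*}
```

Because $(1 - 1/n)^n$ increases toward $e^{-1} \approx 0.368$ as $n$ grows, the bracketed term shrinks, so the probability only gets smaller for larger $n$ or larger $B$.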
                             PROCEDURE - JACKKNIFE-AFTER-BOOTSTRAP

                                1. Given a random sample $x = (x_1, \ldots, x_n)$, calculate a statistic of
                                   interest $\hat{\theta}$.
                                2. Sample with replacement from the original sample to get a bootstrap
                                   sample $x^{*b} = (x_1^*, \ldots, x_n^*)$.
                                3. Using the sample obtained in step 2, calculate the same statistic
                                   that was determined in step 1 and denote it by $\hat{\theta}^{*b}$.
                                4. Repeat steps 2 through 3, $B$ times to estimate the distribution of $\hat{\theta}$.
                                5. Estimate the desired feature of the distribution of $\hat{\theta}$ (e.g., standard
                                   error, bias, etc.) by calculating the corresponding feature of the
                                   distribution of $\hat{\theta}^{*b}$. Denote this bootstrap estimated feature as $\hat{\gamma}_B$.
                                6. Now get the error in $\hat{\gamma}_B$. For $i = 1, \ldots, n$, find all samples
                                   $x^{*b} = (x_1^*, \ldots, x_n^*)$ that do not contain the point $x_i$. These are the
                                   bootstrap samples that can be used to calculate $\hat{\gamma}_B^{(-i)}$.
                                7. Calculate the estimate of the variance of $\hat{\gamma}_B$ using Equation 7.21.
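
The steps above can be sketched in base MATLAB without any toolbox functions. This is a minimal illustration under stated assumptions, not the book's implementation. The data are synthetic stand-ins (not the law data), the variable names `inds`, `thetab`, `gamB`, `gamBi`, and `varjack` are our own, and Equation 7.21 is not reproduced on this page, so step 7 assumes it has the usual jackknife form $\widehat{\mathrm{var}}_{jack}(\hat{\gamma}_B) = \frac{n-1}{n} \sum_{i=1}^{n} (\hat{\gamma}_B^{(-i)} - \bar{\gamma}_B)^2$, where $\bar{\gamma}_B$ is the mean of the $\hat{\gamma}_B^{(-i)}$.

```matlab
% Jackknife-after-bootstrap sketch (base MATLAB, no toolbox calls).
% Step 1 data: synthetic stand-in with two correlated columns
% (hypothetical values, NOT the law data).
n = 15;
B = 1000;
z = randn(n,1);
data = [z, 0.8*z + 0.6*randn(n,1)];

% Steps 2-4: draw B bootstrap samples and keep the indices used.
inds = randi(n, n, B);            % column b holds the indices of sample b
thetab = zeros(B,1);
for b = 1:B
    xb = data(inds(:,b), :);      % bootstrap sample b
    r = corrcoef(xb(:,1), xb(:,2));
    thetab(b) = r(1,2);           % bootstrap replicate of the correlation
end

% Step 5: bootstrap estimate of the standard error.
gamB = std(thetab);

% Step 6: for each i, use only the replicates whose sample omits point i.
gamBi = zeros(n,1);
for i = 1:n
    omit = ~any(inds == i, 1);    % true where sample b does not contain i
    gamBi(i) = std(thetab(omit));
end

% Step 7: jackknife variance of gamB (ASSUMED form of Equation 7.21).
varjack = (n-1)/n * sum((gamBi - mean(gamBi)).^2);
sejack = sqrt(varjack);
```

Storing the index matrix `inds` is the key design choice: it lets step 6 reuse the replicates already computed in steps 2 through 4 instead of resampling for each left-out point.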
                             Example 7.9
                             In this example, we show how to implement the jackknife-after-bootstrap
                             procedure. For simplicity, we will use the MATLAB Statistics Toolbox function
                             called bootstrp, because it returns the indices for each bootstrap sample
                             and the corresponding bootstrap replicate $\hat{\theta}^{*b}$. We return now to the law
                             data where our statistic is the sample correlation coefficient. Recall that we
                             wanted to estimate the standard error of the correlation coefficient, so $\hat{\gamma}_B$ will
                             be the bootstrap estimate of the standard error.
                                % Use the law data.
                                load law
                                lsat = law(:,1);
                                gpa = law(:,2);

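                                % Use the example in MATLAB documentation.
                                % corrcoef returns a 2-by-2 matrix, which bootstrp stores
                                % as a row of 4 values; column 2 of bootstat holds the
                                % correlation between lsat and gpa.
                                B = 1000;
                                [bootstat,bootsam] = bootstrp(B,'corrcoef',lsat,gpa);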
                             The output argument bootstat contains the B bootstrap replicates of the
                             statistic we are interested in, and the columns of bootsam contain the indices
                             to the data points that were in each bootstrap sample. We can loop


                            © 2002 by Chapman & Hall/CRC