Page 263 - Computational Statistics Handbook with MATLAB
Chapter 7: Data Partitioning
7.5 Jackknife-After-Bootstrap
In Chapter 6, we presented the bootstrap method for estimating the statistical
accuracy of estimates. However, the bootstrap estimates of standard error
and bias are also estimates, so they too have error associated with them. This
error arises from two sources, one of which is the usual sampling variability
because we are working with the sample instead of the population. The other
variability comes from the fact that we are working with a finite number B of
bootstrap samples.
We now turn our attention to estimating this variability using the jackknife-after-bootstrap technique. The characteristics of the problem are the same as in Chapter 6. We have a random sample $x = (x_1, \ldots, x_n)$, from which we calculate our statistic $\hat{\theta}$. We estimate the distribution of $\hat{\theta}$ by creating $B$ bootstrap replicates $\hat{\theta}^{*b}$. Once we have the bootstrap replicates, we estimate some feature of the distribution of $\hat{\theta}$ by calculating the corresponding feature of the distribution of bootstrap replicates. We will denote this feature or bootstrap estimate as $\hat{\gamma}_B$. As we saw before, $\hat{\gamma}_B$ could be the bootstrap estimate of the standard error, the bootstrap estimate of a quantile, the bootstrap estimate of bias, or some other quantity.
To obtain the jackknife-after-bootstrap estimate of the variability of $\hat{\gamma}_B$, we leave out one data point $x_i$ at a time and calculate $\hat{\gamma}_B^{(-i)}$ using the bootstrap method on the remaining $n - 1$ data points. We continue in this way until we have the $n$ values of $\hat{\gamma}_B^{(-i)}$. We estimate the variance of $\hat{\gamma}_B$ using the $\hat{\gamma}_B^{(-i)}$ values, as follows

$$\widehat{\mathrm{var}}_{Jack}(\hat{\gamma}_B) = \frac{n-1}{n} \sum_{i=1}^{n} \left( \hat{\gamma}_B^{(-i)} - \bar{\hat{\gamma}}_B \right)^2 , \qquad (7.21)$$

where

$$\bar{\hat{\gamma}}_B = \frac{1}{n} \sum_{i=1}^{n} \hat{\gamma}_B^{(-i)} .$$
Note that this is just the jackknife estimate for the variance of a statistic,
where the statistic that we have to calculate for each jackknife replicate is a
bootstrap estimate.
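The brute-force procedure described above can be sketched as follows. The book's examples use MATLAB; this is a minimal Python sketch instead, and the choice of statistic (the median), the feature $\hat{\gamma}_B$ (its bootstrap standard error), the sample size, and the number of bootstrap samples $B$ are all illustrative assumptions, not values taken from the text.

```python
import numpy as np

def boot_se(sample, B, rng):
    """Bootstrap estimate gamma_hat_B: here, the standard error of the median."""
    n = len(sample)
    reps = np.empty(B)
    for b in range(B):
        # draw a bootstrap sample of size n with replacement, record the statistic
        reps[b] = np.median(rng.choice(sample, size=n, replace=True))
    return reps.std(ddof=1)

def jackknife_after_bootstrap(sample, B, rng):
    """Brute-force jackknife-after-bootstrap variance of gamma_hat_B, Eq. (7.21)."""
    n = len(sample)
    # gamma_hat_B^(-i): redo the entire bootstrap on each leave-one-out sample
    gammas = np.array([boot_se(np.delete(sample, i), B, rng) for i in range(n)])
    gbar = gammas.mean()          # mean of the n leave-one-out bootstrap estimates
    return (n - 1) / n * np.sum((gammas - gbar) ** 2)

rng = np.random.default_rng(42)
x = rng.normal(size=30)           # illustrative random sample
v = jackknife_after_bootstrap(x, B=200, rng=rng)
print(v)
```

Note that each of the $n$ leave-one-out estimates $\hat{\gamma}_B^{(-i)}$ triggers a full bootstrap of $B$ resamples, which is exactly the computational cost that motivates the shortcut discussed next.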
This can be computationally intensive, because we would need a new set of bootstrap samples when we leave out each data point $x_i$. There is a shortcut method for obtaining $\widehat{\mathrm{var}}_{Jack}(\hat{\gamma}_B)$ where we use the original $B$ bootstrap samples. There will be some bootstrap samples where the $i$-th data point does
© 2002 by Chapman & Hall/CRC