Page 227 - Computational Statistics Handbook with MATLAB
Most simulations use M > 1000, and values of M between 10,000 and 25,000
are not uncommon. One important guideline for determining the number of
trials is the purpose of the simulation. If the tail of the distribution is
of interest (e.g., estimating Type I error or getting p-values), then more
trials are needed to ensure a good estimate of that area.
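To see why tail areas call for more trials, recall that a Monte Carlo
estimate of a probability p based on M independent trials has standard
error sqrt(p(1-p)/M). The sketch below is only an illustration of that
formula: the tail area p = 0.05 and the three values of M are assumptions
chosen for the example.

```matlab
% Standard error of a Monte Carlo estimate of a tail probability p
% from M independent trials: se = sqrt(p*(1-p)/M).
% The value of p and the values of M are illustrative assumptions.
p = 0.05;                   % hypothetical tail area (e.g., a Type I error rate)
M = [1000 10000 25000];     % numbers of Monte Carlo trials to compare
se = sqrt(p*(1 - p)./M);    % standard error of the estimated proportion
relse = se/p;               % standard error relative to the size of p
for k = 1:numel(M)
    fprintf('M = %5d: se = %.4f, relative se = %.1f%%\n', ...
        M(k), se(k), 100*relse(k));
end
```

Because the standard error shrinks like 1/sqrt(M), a tenfold increase in M
reduces it only by a factor of about 3.2, and a smaller tail area needs more
trials to reach the same relative precision.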
6.4 Bootstrap Methods
The treatment of the bootstrap methods described here comes from Efron and
Tibshirani [1993]. The interested reader is referred to that text for more
information on the underlying theory behind the bootstrap. There does not
seem to be a consistent terminology in the literature for what techniques
are considered bootstrap methods. Some refer to the resampling techniques of
the previous section as bootstrap methods. Here, we use bootstrap to refer
to Monte Carlo simulations that treat the original sample as the
pseudo-population or as an estimate of the population. Thus, in the steps
where we randomly sample from the pseudo-population, we now resample from
the original sample.
In this section, we discuss the general bootstrap methodology, followed by
some applications of the bootstrap. These include bootstrap estimates of the
standard error, bootstrap estimates of bias, and bootstrap confidence
intervals.
General Bootstrap Methodology
The bootstrap is a method of Monte Carlo simulation where no parametric
assumptions are made about the underlying population that generated the
random sample. Instead, we use the sample as an estimate of the population.
This estimate is called the empirical distribution $\hat{F}$, where each
$x_i$ has probability mass $1/n$. Thus, each $x_i$ has the same likelihood
of being selected in a new sample taken from $\hat{F}$.
When we use $\hat{F}$ as our pseudo-population, then we resample with
replacement from the original sample $x = (x_1, \ldots, x_n)$. We denote the
new sample obtained in this manner by $x^* = (x_1^*, \ldots, x_n^*)$. Since
we are sampling with replacement from the original sample, there is a
possibility that some points $x_i$ will appear more than once in $x^*$ or
maybe not at all. We are looking at the univariate situation, but the
bootstrap concepts can also be applied in the d-dimensional case.
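
The resampling step described above can be sketched in MATLAB as follows.
This is a minimal illustration, not the book's own code: the data values in
x and all variable names are hypothetical, and the function randi from
recent versions of base MATLAB is assumed for drawing the indices (in older
versions, ceil(n*rand(1,n)) would serve the same purpose).

```matlab
% Draw one bootstrap sample x* by resampling with replacement from x.
% The data values below are hypothetical.
x = [2.1 3.4 1.9 5.0 4.2 3.8];   % original sample (illustrative values)
n = numel(x);

% Under the empirical distribution, each x(i) has probability mass 1/n,
% so we pick n indices uniformly at random from 1..n, with replacement.
inds = randi(n, 1, n);
xstar = x(inds);                 % bootstrap sample, same size as x

% Count how many times each original point appears in x*.
% A count of 0 means the point was left out; a count > 1 means it repeats.
counts = accumarray(inds(:), 1, [n 1])';
```

The counts always sum to n, which makes the trade-off visible: every repeated
point in the bootstrap sample means some other point was left out. For
d-dimensional data stored as an n-by-d matrix X, the same indices would
select whole rows, as in X(inds,:).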
A small example serves to illustrate these ideas. Let's say that our random
sample consists of the four numbers $x = (5, 8, 3, 2)$. The following are
possible samples $x^*$, when we sample with replacement from $x$:
© 2002 by Chapman & Hall/CRC