
An Introduction to the Bootstrap

Bradley Efron and Robert J. Tibshirani
Publisher: Chapman & Hall/CRC
Copyright: 1993
ISBN-13: 978-0-412-04231-7
Pages: 436; hardcover
Price: $89.75

Comment from the Stata technical group

In 1979, Bradley Efron revolutionized the field of statistics with his invention of the bootstrap, which he introduced to the world with his paper in the Annals of Statistics. The bootstrap broadly refers to a continually growing collection of methodologies in which data are resampled so that statistical inference can draw on the information the data contain about their own probability distribution. Conceptually simple yet computationally intense, the bootstrap owes much of its rise in popularity over the last 20 years to the advent of the personal computer over the same period. As computers become faster and more powerful, the bootstrap becomes a more practical and indispensable tool for the data analyst.

This text, while a complete reference on the topic, is fairly nonmathematical in its treatment of the bootstrap in all its forms. As such, it is accessible not only to statisticians but also to anyone interested in drawing conclusions from data. The book begins with a conceptual discussion of the accuracy of a sample mean and proceeds in the first few chapters to cover a few pertinent basics of probability theory and the properties of the empirical distribution as an estimate of the true cumulative distribution. The distinction between the nonparametric bootstrap and the parametric bootstrap is then discussed, and the impact of the number of bootstrap samples on the estimated standard errors is assessed.
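For readers new to the idea, the nonparametric bootstrap estimate of a standard error can be sketched in a few lines. The following Python sketch is illustrative only and is not taken from the book: the function name `bootstrap_se`, the made-up `sample` values, and the choices of replication counts are all assumptions. It resamples the data with replacement, recomputes the statistic on each resample, and reports the standard deviation of those replicates for several numbers of bootstrap samples.

```python
import random
import statistics


def bootstrap_se(data, stat, n_boot=1000, seed=0):
    """Nonparametric bootstrap estimate of the standard error of stat(data).

    Hypothetical helper written for illustration; not from the book.
    """
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original data.
        resample = rng.choices(data, k=n)
        replicates.append(stat(resample))
    # The standard deviation of the replicates estimates the standard error.
    return statistics.stdev(replicates)


# Made-up data, used only to exercise the function.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 4.4]

# Compare the estimate across different numbers of bootstrap samples.
for b in (25, 200, 2000):
    print(b, round(bootstrap_se(sample, statistics.mean, n_boot=b), 3))
```

With a small number of resamples the estimate can vary noticeably from one seed to the next; it tends to settle as the number of resamples grows.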

The middle chapters of the text explore bootstrapping different data structures, issues unique to regression models, bootstrap estimates of bias, the jackknife, several forms of bootstrap confidence intervals, permutation tests, hypothesis testing, estimates of prediction error, and using the bootstrap to find an optimal smoothing parameter in nonparametric regression.

The last chapters of the text are more mathematical and serve to provide the theoretical backbone for much of the material presented earlier. Of particular interest to epidemiologists and others in related fields is chapter 25, which covers FDA bioequivalence and shows how bootstrapping can tackle the important issues of power and sample size.

Table of contents

1 Introduction
1.1 An overview of this book
1.2 Information for instructors
1.3 Some of the notation used in the book
2 The accuracy of a sample mean
2.1 Problems
3 Random samples and probabilities
3.1 Introduction
3.2 Random samples
3.3 Probability theory
3.4 Problems
4 The empirical distribution function and the plug-in principle
4.1 Introduction
4.2 The empirical distribution function
4.3 The plug-in principle
4.4 Problems
5 Standard errors and estimated standard errors
5.1 Introduction
5.2 The standard error of a mean
5.3 Estimating the standard error of the mean
5.4 Problems
6 The bootstrap estimate of standard error
6.1 Introduction
6.2 The bootstrap estimate of standard error
6.3 Example: the correlation coefficient
6.4 The number of bootstrap replications B
6.5 The parametric bootstrap
6.6 Bibliographic notes
6.7 Problems
7 Bootstrap standard errors: some examples
7.1 Introduction
7.2 Example 1: test score data
7.3 Example 2: curve fitting
7.4 An example of bootstrap failure
7.5 Bibliographic notes
7.6 Problems
8 More complicated data structures
8.1 Introduction
8.2 One-sample problems
8.3 The two-sample problem
8.4 More general data structures
8.5 Example: lutenizing hormone
8.6 The moving blocks bootstrap
8.7 Bibliographic notes
8.8 Problems
9 Regression models
9.1 Introduction
9.2 The linear regression model
9.3 Example: the hormone data
9.4 Application of the bootstrap
9.5 Bootstrapping pairs vs bootstrapping residuals
9.6 Example: the cell survival data
9.7 Least median of squares
9.8 Bibliographic notes
9.9 Problems
10 Estimates of bias
10.1 Introduction
10.2 The bootstrap estimate of bias
10.3 Example: the patch data
10.4 An improved estimate of bias
10.5 The jackknife estimate of bias
10.6 Bias correction
10.7 Bibliographic notes
10.8 Problems
11 The jackknife
11.1 Introduction
11.2 Definition of the jackknife
11.3 Example: test score data
11.4 Pseudo-values
11.5 Relationship between the jackknife and bootstrap
11.6 Failure of the jackknife
11.7 The delete-d jackknife
11.8 Bibliographic notes
11.9 Problems
12 Confidence intervals based on bootstrap "tables"
12.1 Introduction
12.2 Some background on confidence intervals
12.3 Relation between confidence intervals and hypothesis tests
12.4 Student's t interval
12.5 The bootstrap-t interval
12.6 Transformations and the bootstrap-t
12.7 Bibliographic notes
12.8 Problems
13 Confidence intervals based on bootstrap percentiles
13.1 Introduction
13.2 Standard normal intervals
13.3 The percentile interval
13.4 Is the percentile interval backwards?
13.5 Coverage performance
13.6 The transformation-respecting property
13.7 The range-preserving property
13.8 Discussion
13.9 Bibliographic notes
13.10 Problems
14 Better bootstrap confidence intervals
14.1 Introduction
14.2 Example: the spatial test data
14.3 The BCa method
14.4 The ABC method
14.5 Example: the tooth data
14.6 Bibliographic notes
14.7 Problems
15 Permutation tests
15.1 Introduction
15.2 The two-sample problem
15.3 Other test statistics
15.4 Relationship of hypothesis tests to confidence intervals and the bootstrap
15.5 Bibliographic notes
15.6 Problems
16 Hypothesis testing with the bootstrap
16.1 Introduction
16.2 The two-sample problem
16.3 Relationship between the permutation test and the bootstrap
16.4 The one-sample problem
16.5 Testing multimodality of a population
16.6 Discussion
16.7 Bibliographic notes
16.8 Problems
17 Cross-validation and other estimates of prediction error
17.1 Introduction
17.2 Example: hormone data
17.3 Cross-validation
17.4 Cp and other estimates of prediction error
17.5 Example: classification trees
17.6 Bootstrap estimates of prediction error
17.6.1 Overview
17.6.2 Some details
17.7 The .632 bootstrap estimator
17.8 Discussion
17.9 Bibliographic notes
17.10 Problems
18 Adaptive estimation and calibration
18.1 Introduction
18.2 Example: smoothing parameter selection for curve fitting
18.3 Example: calibration of a confidence point
18.4 Some general considerations
18.5 Bibliographic notes
18.6 Problems
19 Assessing the error in bootstrap estimates
19.1 Introduction
19.2 Standard error estimation
19.3 Percentile estimation
19.4 The jackknife-after-bootstrap
19.5 Derivations
19.6 Bibliographic notes
19.7 Problems
20 A geometrical representation for the bootstrap and jackknife
20.1 Introduction
20.2 Bootstrap sampling
20.3 The jackknife as an approximation to the bootstrap
20.4 Other jackknife approximations
20.5 Estimates of bias
20.6 An example
20.7 Bibliographic notes
20.8 Problems
21 An overview of nonparametric and parametric inference
21.1 Introduction
21.2 Distributions, densities and likelihood functions
21.3 Functional statistics and influence functions
21.4 Parametric maximum likelihood inference
21.5 The parametric bootstrap
21.6 Relation of parametric maximum likelihood, bootstrap and jackknife approaches
21.6.1 Example: influence components for the mean
21.7 The empirical cdf as a maximum likelihood estimate
21.8 The sandwich estimator
21.8.1 Example: Mouse data
21.9 The delta method
21.9.1 Example: delta method for the mean
21.9.2 Example: delta method for the correlation coefficient
21.10 Relationship between the delta method and infinitesimal jackknife
21.11 Exponential families
21.12 Bibliographic notes
21.13 Problems
22 Further topics in bootstrap confidence intervals
22.1 Introduction
22.2 Correctness and accuracy
22.3 Confidence points based on approximate pivots
22.4 The BCa interval
22.5 The underlying basis for the BCa interval
22.6 The ABC approximation
22.7 Least favorable families
22.8 The ABCq method and transformations
22.9 Discussion
22.10 Bibliographic notes
22.11 Problems
23 Efficient bootstrap computations
23.1 Introduction
23.2 Post-sampling adjustments
23.3 Application to bootstrap bias estimation
23.4 Application to bootstrap variance estimation
23.5 Pre- and post-sampling adjustments
23.6 Importance sampling for tail probabilities
23.7 Application to bootstrap tail probabilities
23.8 Bibliographic notes
23.9 Problems
24 Approximate likelihoods
24.1 Introduction
24.2 Empirical likelihood
24.3 Approximate pivot methods
24.4 Bootstrap partial likelihood
24.5 Implied likelihood
24.6 Discussion
24.7 Bibliographic notes
24.8 Problems
25 Bootstrap bioequivalence
25.1 Introduction
25.2 A bioequivalence problem
25.3 Bootstrap confidence intervals
25.4 Bootstrap power calculations
25.5 A more careful power calculation
25.6 Fieller’s intervals
25.7 Bibliographic notes
25.8 Problems
26 Discussion and further topics
26.1 Discussion
26.2 Some questions about the bootstrap
26.3 References on further topics
Appendix: software for bootstrap computations
Some available software
S language functions
Author index
Subject index