Introduction to Bootstrap Methods with Applications to R
It was bootstrapping that set me off on my statistics journey years ago. I have very fond memories of the days when I could understand simple things in statistics without resorting to complicated-looking formulae. A few lines of code were all that was needed. Slowly I became more curious about many things in statistics, and that's how my love affair with stats began. There are two bibles that any newbie to the bootstrap should go over: one by Efron & Tibshirani and the other by Davison & Hinkley. Any other specific topic you can always pick up by reading papers. It is always a nice feeling for me to read about bootstrapping. However, reading this book was an extremely unpleasant experience.
In recent years, with the rise of R, many authors have started writing books titled "Introduction to ____ (fill in any statistical technique you want) using R". With many more people adopting R, these books hope to fill the need of a data analyst who might not be willing to immerse himself/herself in the deep theory behind a technique. The target audience might just want some package that can be used to crunch out numbers. Fair enough. Not everyone has the time and inclination to know the details. There are some amazing books that fill this need and do it really well. Sadly, this book is not in that category. Neither does it explain the key functions for bootstrapping nor does it explain the code that has been sprinkled through the book. So the R in the title is definitely misleading. Instead of talking about the nuances of the various functions based on the authors' experience, all one gets to see is some spaghetti code. I can't imagine an author using 15 pages of a book (within a chapter, no less, and not the appendix) to list the various packages that have some kind of bootstrap function. That's exactly what the authors of this book have done. Insane! The book gives a historical perspective on various developments around bootstrapping techniques, but you can't learn anything specific from it. It just gives a 10,000 ft. overview of various aspects of bootstrapping. I seriously do not understand why the authors even wrote this book. My only purpose in writing this review is to dissuade others from reading it and wasting their time and money.
Introduction
The bootstrap is one of a number of techniques that fall under the broad umbrella of nonparametric statistics commonly called resampling methods. It was the article by Brad Efron in 1979 that started it all. The impact of this publication can be gauged by the following statement in Davison and Hinkley's book:
The idea of replacing complicated and often inaccurate approximations to biases, variances and other measures of uncertainty by computer simulation caught the imagination of both theoretical researchers and users of statistical methods
Efron's motivation was to construct a simple approximation to the jackknife procedure that was initially developed by John Tukey. Permutation methods had been known since the 1930s, but they were ineffective beyond small samples. Efron connected bootstrapping to the then-available jackknife, delta method, cross-validation and permutation tests. He was the first to show that the bootstrap was a real competitor to the jackknife and the delta method for estimating the standard error of an estimator. Throughout the 1980s and 1990s there was an explosion of papers on the subject. The bootstrap was used for confidence intervals, hypothesis testing and more complex problems. In 1983, Efron wrote a remarkable paper showing that the bootstrap worked better than cross-validation in classification problems of a certain kind. While these positive developments were happening, by the 1990s there were also papers showing that bootstrap estimates were not consistent in specific settings; the first published example of an inconsistent bootstrap estimate had appeared in 1981. By the year 2000, there were quite a few articles showing that bootstrapping could be a great tool for estimating various quantities but that it could also be inconsistent. After this brief history, the chapter goes on to define some basic terms and explain four related methods: the jackknife, the delta method, cross-validation and subsampling. Out of all the packages mentioned in the chapter (which take up 15 pages), I think all one needs to tinker with to understand the basic principles are boot and bootstrap.
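To give a flavour of what the book could have shown: here is a minimal sketch (my own, not from the book) of estimating the bias and standard error of a sample median with the boot package:

```r
library(boot)

# The statistic must take the data and a vector of resampled indices
med <- function(x, idx) median(x[idx])

set.seed(1)
x <- rexp(50)                # a skewed toy sample
b <- boot(x, med, R = 2000)  # 2000 bootstrap replicates
b                            # prints the bootstrap estimates of bias and standard error
```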
Estimation
This chapter talks about improving point estimation via bootstrapping. Historically, the bootstrap method was used to estimate the standard error of an estimate and, later, for bias adjustment. The chapter begins with a simple example where the bootstrap is used to compute the bias of an estimator. Subsequently, a detailed set of examples of using bootstrapping to improve cross-validation estimates is given. These examples show that there are many instances where the bootstrapped cross-validation technique performs better than other estimators such as the CV, 632 and e0 estimators. For estimating a location parameter of a random variable from a particular distribution, MLE does a great job and hence one need not really use the bootstrap. However, there are cases where MLE estimates have no closed-form solution. In all such cases, one can just bootstrap away to glory. In the case of linear regression, there are two ways in which bootstrapping can be used, sketched below. The first method involves residuals: bootstrap the residuals and create a set of new dependent variables, which can then be used to form a bootstrapped sample of regression coefficients. The second method is bootstrapping pairs: sample pairs of dependent and independent variables together and compute the regression coefficients. Between the two, the second method is found to be more robust to model misspecification.
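Again, the book provides no usable code for any of this. A minimal sketch of the two schemes on a toy linear model (my own illustration) might look like this:

```r
set.seed(1)
n <- 100
x <- runif(n)
y <- 2 + 3 * x + rnorm(n)
fit <- lm(y ~ x)

# (1) Residual bootstrap: resample residuals, rebuild the response
res <- resid(fit)
fy  <- fitted(fit)
coef_resid <- replicate(2000, {
  y_star <- fy + sample(res, n, replace = TRUE)
  coef(lm(y_star ~ x))
})

# (2) Pairs bootstrap: resample (x, y) pairs together
coef_pairs <- replicate(2000, {
  idx <- sample(n, replace = TRUE)
  coef(lm(y[idx] ~ x[idx]))
})

apply(coef_resid, 1, sd)  # bootstrap standard errors, residual scheme
apply(coef_pairs, 1, sd)  # bootstrap standard errors, pairs scheme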
Some of the other uses of bootstrapping mentioned in the chapter are:
- Dealing with heteroskedastic errors by using the wild bootstrap (a quick sketch follows this list)
- Nonlinear regression
- Nonparametric regression
- Applications to the CART family (bagging, boosting and random forests)
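The wild bootstrap, at least, deserved a few lines of code in the book. A rough sketch of my own, assuming Rademacher multipliers (one of several common choices of weights):

```r
set.seed(1)
n <- 100
x <- runif(n)
y <- 2 + 3 * x + rnorm(n, sd = x)  # error variance grows with x
fit <- lm(y ~ x)
res <- resid(fit)
fy  <- fitted(fit)

# Keep each residual attached to its own observation, but flip its sign
# at random, which preserves the heteroskedastic variance pattern
coef_wild <- replicate(2000, {
  v <- sample(c(-1, 1), n, replace = TRUE)  # Rademacher multipliers
  coef(lm(fy + res * v ~ x))
})
apply(coef_wild, 1, sd)  # standard errors that respect the heteroskedasticity
```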
My crib about this chapter is this: you are introducing data mining techniques like LDA, QDA, bagging etc. in a chapter where the reader is supposed to get an intuition about how bootstrapping can be used to get a point estimate. Who is the target audience for this book? Someone already familiar with these data mining techniques will gloss over the material as there is nothing new in it, a newbie will be overwhelmed, and for anyone in between the content will appear totally random. Extremely poor choice of content for an introductory book.
Confidence Intervals
One of the advantages of generating bootstrapped samples is that they can be used to construct confidence intervals, and there are many ways to build them. The chapter discusses the bootstrap-t, the iterated bootstrap, BC, BCa and the tilted bootstrap. Again, I don't expect any newbie to come away with a clear understanding of these methods after reading this chapter. All the authors have managed to do is give a laundry list of methods with some description of each. And yes, an extensive set of references that makes you feel you are reading a paper rather than a book. If you really want to understand these methods, the bibles mentioned at the beginning are the right sources.
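Ironically, several of these intervals are one function call away in the boot package. A minimal sketch (mine, continuing the toy median example from earlier):

```r
library(boot)
set.seed(1)
x <- rexp(50)
b <- boot(x, function(d, idx) median(d[idx]), R = 2000)

# Normal, basic, percentile and BCa intervals from the same replicates
boot.ci(b, type = c("norm", "basic", "perc", "bca"))
```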
Hypothesis testing
For simple examples, hypothesis testing can be done based on the confidence intervals obtained via bootstrapped samples. There are subtle aspects that one needs to take care of, such as resampling from the pooled data under the null hypothesis. Amazingly, the authors do not even provide sample code to illustrate this point; the code that is provided resamples from the individual samples. Code should have been provided to illustrate sampling from the pooled data, something like the sketch below. Again, a poor choice in the way the content is presented.
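Here is the kind of pooled-resampling illustration I had in mind (my own sketch, for a two-sample test of equal means):

```r
# Bootstrap test of H0: equal means, resampling from the pooled data
set.seed(1)
x <- rnorm(30, mean = 0)
y <- rnorm(40, mean = 0.5)
t_obs <- mean(x) - mean(y)

pooled <- c(x, y)  # under H0 both samples come from one distribution
t_star <- replicate(5000, {
  x_star <- sample(pooled, length(x), replace = TRUE)
  y_star <- sample(pooled, length(y), replace = TRUE)
  mean(x_star) - mean(y_star)
})
mean(abs(t_star) >= abs(t_obs))  # two-sided bootstrap p-value
```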
Time Series
The chapter gives a laundry list of bootstrap procedures in the context of time series: the model-based bootstrap, non-overlapping block bootstrap, circular block bootstrap, stationary block bootstrap, tapered block bootstrap, dependent wild bootstrap and sieve bootstrap. Again, a very cursory treatment and references to a whole lot of papers and books. The authors get it completely wrong: in an introductory book there must be R code and simple examples to illustrate the point. If instead all the reader sees is a pile of references to papers and journal articles, he or she is going to junk the book and move on.
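The block bootstraps, for instance, are directly available via tsboot in the boot package. A minimal sketch on a toy AR(1) series (again mine, not the book's):

```r
library(boot)
set.seed(1)
x <- arima.sim(list(ar = 0.6), n = 200)  # toy AR(1) series

# Lag-1 autocorrelation as the statistic of interest
ac1 <- function(ts) acf(ts, lag.max = 1, plot = FALSE)$acf[2]

tsboot(x, ac1, R = 1000, l = 20, sim = "fixed")  # fixed-length blocks
tsboot(x, ac1, R = 1000, l = 20, sim = "geom")   # stationary bootstrap
```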
Bootstrap variants
The same painful saga continues. The chapter gives a list of techniques: the Bayesian bootstrap, smoothed bootstrap, parametric bootstrap, double bootstrap, m-out-of-n bootstrap and wild bootstrap. There is no code whatsoever to guide the reader, and the explanation given to introduce these topics is totally inadequate.
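Even the parametric bootstrap, the easiest of these to demonstrate, gets no code. A minimal sketch using boot's parametric interface to bootstrap the MLE of an exponential rate (my example, not the book's):

```r
library(boot)
set.seed(1)
x <- rexp(40, rate = 2)
rate_hat <- 1 / mean(x)  # MLE of the exponential rate

# Resample from the fitted model rather than the empirical distribution
gen  <- function(data, mle) rexp(length(data), rate = mle)
stat <- function(data) 1 / mean(data)
boot(x, stat, R = 2000, sim = "parametric", ran.gen = gen, mle = rate_hat)
```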
When the bootstrap is inconsistent and how to remedy it
This chapter gives a set of scenarios in which the bootstrap procedure can fail:
- Small sample sizes (less than 10), for which the bootstrapped samples are not reliable
- Distributions with infinite second moments
- Estimating extreme values (illustrated in the sketch after this list)
- Unstable AR processes
- Long-memory processes
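The extreme-value case is easy to demonstrate in a few lines, which makes the absence of code here all the more glaring. A quick illustration of my own: for the sample maximum, the bootstrap distribution piles up on the observed maximum and never settles down.

```r
# P(bootstrap max equals the sample max) = 1 - (1 - 1/n)^n -> 1 - 1/e
set.seed(1)
x <- runif(100)
hits <- replicate(5000, max(sample(x, replace = TRUE)) == max(x))
mean(hits)  # about 0.63, no matter how large n gets
```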
Takeaway:
This is the worst book I have read in recent times. The authors are trying to cash in on the popularity of R, and the title of the book is completely misleading: it is neither an introduction to bootstrap methods nor an introduction to R tools for bootstrapping. All it does is give a cursory and inadequate treatment of the bootstrap. Do not buy or read this book. Total waste of time and money.