Purpose
The purpose is to run the first example on jackknife estimation from Chapter 10 of Efron's book on the bootstrap.
> setwd("C:/Cauldron/garage/Volatility/Learn/IntroToBootstrap")
> library(bootstrap)
> data(patch)
Jackknife procedure
n recomputations of the estimate, each time with one of the data points removed.
> mus <- apply(patch, 2, mean)
> theta <- mus[6]/mus[5]
> n <- dim(patch)[1]
> thetas <- numeric()
> for (i in 1:n) {
+ temp.data <- patch[-i, ]
+ mus.t <- apply(temp.data, 2, mean)
+ theta.t <- mus.t[6]/mus.t[5]
+ thetas[i] <- theta.t
+ }
> print(thetas)
[1] -0.05711856 -0.12849970 -0.02145610 -0.13245033 -0.05067038 -0.08404803
[7] -0.06486298 -0.02219698
Jackknife bias
> bias.jack <- (n - 1) * (mean(thetas) - theta)
> print(bias.jack)
y
0.008002488
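As a cross-check (not part of the original session): the bootstrap package loaded above also ships a jackknife() helper which, if I recall its interface correctly, takes a vector of observation indices plus a theta(indices, data) function and returns jack.bias and jack.se. A minimal sketch under that assumption:

# hedged cross-check using jackknife() from library(bootstrap)
ratio <- function(idx, xdata) mean(xdata[idx, 6])/mean(xdata[idx, 5])
jk <- jackknife(1:n, ratio, patch)  # assumes the jackknife(x, theta, ...) interface
jk$jack.bias                        # should agree with bias.jack above
jk$jack.se                          # Tukey's jackknife standard error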
The estimate converges to the limiting bias estimate with just n recomputations. The downside is that the jackknife may not always work, especially when the statistical functional is not a smooth statistic; the median is the standard example, as the sketch below shows.
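To illustrate the smoothness caveat (a sketch on simulated data, not part of the original analysis): for the median, the leave-one-out values collapse to just a couple of distinct numbers, so the jackknife variability estimate behaves poorly.

# jackknife on a non-smooth statistic: the median
set.seed(1)
x <- rnorm(20)
med.loo <- sapply(1:20, function(i) median(x[-i]))  # leave-one-out medians
table(round(med.loo, 4))                            # only two distinct values
sqrt((19/20) * sum((med.loo - mean(med.loo))^2))    # jackknife SE of the median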
- Computation of the variability of the jackknife bias estimate, by rerunning the jackknife on bootstrap resamples of the data
> jackknife.bootstrap <- function() {
+ n <- dim(patch)[1]
+ # draw a bootstrap resample of the rows, then jackknife that resample
+ patch.temp <- patch[sample(1:n, n, replace = TRUE), ]
+ mus <- apply(patch.temp, 2, mean)
+ theta <- mus[6]/mus[5]
+ thetas <- numeric(n)
+ for (i in 1:n) {
+ temp.data <- patch.temp[-i, ]
+ mus.t <- apply(temp.data, 2, mean)
+ theta.t <- mus.t[6]/mus.t[5]
+ thetas[i] <- theta.t
+ }
+ # jackknife bias estimate computed on this resample
+ bias.jack <- (n - 1) * (mean(thetas) - theta)
+ return(bias.jack)
+ }
> res <- c(replicate(200, jackknife.bootstrap()))
> hist(res, breaks = 30, main = "bootstrap replications of jackknife bias",
+ xlim = c(-2, 2))
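To put a number on the spread shown in the histogram, a quick summary of the replications could look like this (a sketch, not part of the original session):

sd(res)                         # bootstrap SE of the jackknife bias estimate
quantile(res, c(0.025, 0.975))  # rough 95% interval for the bias estimate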
> jackknife.bootstrap.se <- function() {
+ n <- dim(patch)[1]
+ # draw a bootstrap resample of the rows, then jackknife that resample
+ patch.temp <- patch[sample(1:n, n, replace = TRUE), ]
+ thetas <- numeric(n)
+ for (i in 1:n) {
+ temp.data <- patch.temp[-i, ]
+ mus.t <- apply(temp.data, 2, mean)
+ theta.t <- mus.t[6]/mus.t[5]
+ thetas[i] <- theta.t
+ }
+ # Tukey's jackknife standard error, computed on this resample
+ se.jack <- sqrt(((n - 1)/n) * sum((thetas - mean(thetas))^2))
+ return(se.jack)
+ }
> res <- c(replicate(200, jackknife.bootstrap.se()))
> hist(res, breaks = 30, main = "bootstrap replications of jackknife SE",
+ xlim = c(0, 0.8))
Jackknife standard error: Tukey's formula is not always trusted, but for the example in the book it works out fine.
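For completeness, Tukey's formula applied once to the original data, reusing n and the leave-one-out values thetas computed at the top of this note (a sketch, not part of the original session):

# Tukey's jackknife standard error of the ratio estimate on the original patch data
se.jack <- sqrt(((n - 1)/n) * sum((thetas - mean(thetas))^2))
print(se.jack)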