OLS vs GLS
Purpose
The purpose of this script is to explore the difference between OLS and GLS Standard errors
When errors are correlated , you can calculate estimates using three ways
- OLS ignoring the fact the errors are correlated
- OLS with correction to the standard errors
- GLS
Lets start with a simple model y.t = 1 + x.t + u.t where Xt Unif(0,1) and u N(0,xt^alpha)
Method 1
This is the method where OLS is used ignoring the fact that error are correlated.
> results <- data.frame() > for (i in 1:500) { + alpha <- 0.5 + x.t <- runif(100) + u.t <- rnorm(100, 0, xt^alpha) + y.t <- 1 + x.t + u.t + fit <- lm(y.t ~ x.t) + fit.sum <- summary(fit) + results <- rbind(results, fit.sum$coefficients[, 1]) + } > colnames(results) <- c("intercept", "slope") |
Method 2
This is the method where the correct OLS is used , meaning that the estimate of the covariance matrix is taken from the residuals.
> results2 <- data.frame() > for (i in 1:500) { + alpha <- 0.5 + x.t <- runif(100) + u.t <- rnorm(100, 0, xt^alpha) + y.t <- 1 + x.t + u.t + fit <- lm(y.t ~ x.t) + fit.sum <- summary(fit) + X <- cbind(rep(1, 100), xt) + xtx.inv <- solve(crossprod(X, X)) + error.sq <- as.vector(y.t - X %*% fit.sum$coefficients[, + 1])^2 + cov.sample <- diag(error.sq) + var.beta <- diag(xtx.inv %*% t(X) %*% cov.sample %*% X %*% + xtx.inv) + results2 <- rbind(results2, var.beta) + } > colnames(results2) <- c("intercept", "slope") |
Method 3
This uses actual GLS formula to calculate the estimates
> results3 <- data.frame() > for (i in 1:500) { + alpha <- 0.5 + x.t <- runif(100) + u.t <- rnorm(100, 0, xt^alpha) + cov.error <- diag(100, xt^alpha) + sm <- chol(cov.error) + smi <- solve(t(sm)) + y.t <- 1 + x.t + u.t + y.t.2 <- matrix((smi %*% y.t), ncol = 1) + x.t.2 <- matrix((smi %*% x.t), ncol = 1) + fit <- lm(y.t.2 ~ x.t.2) + fit.sum <- summary(fit) + results3 <- rbind(results3, fit.sum$coefficients[, 1]) + } > colnames(results3) <- c("intercept", "slope") > final <- data.frame(rbind(rbind(sd(results), sqrt(mean(results2))), + sd(results3))) > rownames(final) <- c("OLS-Incorrect", "OLS - Correct", "GLS") > final intercept slope OLS-Incorrect 0.13651642 0.2427060 OLS - Correct 0.13989072 0.3111644 GLS 0.01481729 0.2460846 |
HC0 is the most commonly used form of the HCCM and is referred to variously as the White, Eicker, or Huber estimator. As shown by White (1980) and others, HC0 is a consistent estimator.OLS-corrected is HC0 MacKinnon and White (1985) considered three alternative estimators designed to improve the small sample properties of HC0. HC1 = HC0*n/n-k HC2 — Use e2 / ( 1- hi) where hi is the hat value HC3 — Use e2 / ( 1- hi)^2 where hi is the hat value
It seems that HC3 is the best way to correct heteroscedasticity