OLS vs GLS

Purpose
The purpose of this script is to explore the difference between OLS and GLS standard errors.

When the errors are heteroscedastic or correlated, the estimates and their standard errors can be calculated in three ways:

OLS, ignoring the fact that the errors are heteroscedastic or correlated
OLS with a correction to the standard errors
GLS

Let's start with a simple model, y.t = 1 + x.t + u.t, where x.t ~ Unif(0,1) and u.t ~ N(0, sd = x.t^alpha). The errors are independent, but their variance, x.t^(2*alpha), depends on x.t.
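The setup can be restated in matrix form as follows. This is only a restatement in LaTeX; it assumes, as the R code does when it passes x.t^alpha as the third argument of rnorm, that x.t^alpha is the standard deviation of the error and not its variance. The symbol \Omega is introduced here for the error covariance matrix and does not appear in the script.

```latex
y_t = 1 + x_t + u_t, \qquad x_t \sim \mathrm{Unif}(0,1), \qquad u_t \mid x_t \sim N\!\left(0,\; x_t^{2\alpha}\right)

\Omega = \operatorname{Var}(u \mid X) = \operatorname{diag}\!\left(x_1^{2\alpha}, \dots, x_n^{2\alpha}\right)
```

With alpha = 0.5 the error variance is simply x_t, so observations with x_t near 0 are almost noise-free.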

Method 1
This is plain OLS, ignoring the fact that the errors are heteroscedastic. The coefficient estimates from each of the 500 replications are stored in results.

> results <- data.frame()
> for (i in 1:500) {
+     alpha <- 0.5
+     x.t <- runif(100)
+     u.t <- rnorm(100, 0, x.t^alpha)
+     y.t <- 1 + x.t + u.t
+     fit <- lm(y.t ~ x.t)
+     fit.sum <- summary(fit)
+     results <- rbind(results, fit.sum$coefficients[, 1])
+ }
> colnames(results) <- c("intercept", "slope")

Method 2
This is OLS with corrected standard errors: the coefficient estimates are the same as in Method 1, but the error covariance matrix is estimated from the squared residuals and used to compute the variance of the estimates.

> results2 <- data.frame()
> for (i in 1:500) {
+     alpha <- 0.5
+     x.t <- runif(100)
+     u.t <- rnorm(100, 0, x.t^alpha)
+     y.t <- 1 + x.t + u.t
+     fit <- lm(y.t ~ x.t)
+     fit.sum <- summary(fit)
+     X <- cbind(rep(1, 100), x.t)
+     xtx.inv <- solve(crossprod(X, X))
+     error.sq <- as.vector(y.t - X %*% fit.sum$coefficients[,
+         1])^2
+     cov.sample <- diag(error.sq)
+     var.beta <- diag(xtx.inv %*% t(X) %*% cov.sample %*% X %*%
+         xtx.inv)
+     results2 <- rbind(results2, var.beta)
+ }
> colnames(results2) <- c("intercept", "slope")
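The quantity stored as var.beta in the loop above is the diagonal of the sandwich form of the OLS covariance matrix, with the unknown error covariance replaced by the squared residuals. In LaTeX notation (the symbols \hat\beta, \hat e_t and \Omega are introduced here for illustration and are not names used in the script):

```latex
\operatorname{Var}(\hat\beta_{OLS} \mid X) = (X'X)^{-1} X' \Omega X (X'X)^{-1},
\qquad
\hat\Omega = \operatorname{diag}\!\left(\hat e_1^{\,2}, \dots, \hat e_n^{\,2}\right)

\text{Usual OLS formula, valid only when } \Omega = \sigma^2 I: \quad
\widehat{\operatorname{Var}}(\hat\beta_{OLS}) = \hat\sigma^2 (X'X)^{-1}
```

The first line reduces to the second when all error variances are equal, which is why the usual formula can mislead under heteroscedasticity.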

Method 3
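This uses the actual GLS formula to calculate the estimates: the data are premultiplied by the inverse of the Cholesky factor of the error covariance matrix, and OLS is run on the transformed data.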

> results3 <- data.frame()
> for (i in 1:500) {
+     alpha <- 0.5
+     x.t <- runif(100)
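+     u.t <- rnorm(100, 0, x.t^alpha)
+     # error covariance: diagonal, with variances (sd^2) x.t^(2 * alpha)
+     cov.error <- diag(x.t^(2 * alpha))
+     sm <- chol(cov.error)
+     smi <- solve(t(sm))
+     y.t <- 1 + x.t + u.t
+     y.t.2 <- matrix((smi %*% y.t), ncol = 1)
+     # transform the whole design matrix, including the constant column
+     X.2 <- smi %*% cbind(1, x.t)
+     fit <- lm(y.t.2 ~ 0 + X.2)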
+     fit.sum <- summary(fit)
+     results3 <- rbind(results3, fit.sum$coefficients[, 1])
+ }
> colnames(results3) <- c("intercept", "slope")
> final <- data.frame(rbind(sapply(results, sd), sqrt(colMeans(results2)),
+     sapply(results3, sd)))
> rownames(final) <- c("OLS-Incorrect", "OLS - Correct", "GLS")
> final
               intercept     slope
OLS-Incorrect 0.13651642 0.2427060
OLS - Correct 0.13989072 0.3111644
GLS           0.01481729 0.2460846
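Because the error covariance here is diagonal, GLS by transformation should coincide with weighted least squares, where each observation is weighted by the inverse of its error variance. The following is a minimal sketch of that check for a single simulated sample, not part of the original script; the seed and the names s, y.star, X.star, fit.gls and fit.wls are chosen for illustration.

```r
# Sketch: GLS by transformation vs. weighted least squares, one simulated sample
set.seed(1)                        # arbitrary seed, for reproducibility only
n <- 100
alpha <- 0.5
x.t <- runif(n)
u.t <- rnorm(n, 0, x.t^alpha)      # the error sd is x.t^alpha
y.t <- 1 + x.t + u.t

# GLS by transformation: divide every column, including the constant,
# by the error standard deviation
s <- x.t^alpha
y.star <- y.t / s
X.star <- cbind(1, x.t) / s        # row i is divided by s[i]
fit.gls <- lm(y.star ~ 0 + X.star)

# The same estimator via weights equal to the inverse error variance
fit.wls <- lm(y.t ~ x.t, weights = 1 / x.t^(2 * alpha))

gls.coef <- unname(coef(fit.gls))
wls.coef <- unname(coef(fit.wls))
print(rbind(gls = gls.coef, wls = wls.coef))
```

The two rows printed should agree up to rounding error. In practice the true variances are unknown, so the weights would have to be estimated.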

HC0 is the most commonly used form of the heteroscedasticity-consistent covariance matrix (HCCM) and is referred to variously as the White, Eicker, or Huber estimator. As shown by White (1980) and others, HC0 is a consistent estimator. The "OLS - Correct" row above is HC0. MacKinnon and White (1985) considered three alternative estimators designed to improve the small-sample properties of HC0, where n is the number of observations, k the number of regressors, e the residual and h.i the hat value:

HC1: HC0 * n/(n - k)
HC2: use e^2/(1 - h.i) in place of e^2
HC3: use e^2/(1 - h.i)^2 in place of e^2
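The four variants can be computed directly in base R from the residuals and hat values of a fitted lm object. The following is a minimal sketch for one simulated sample from the model used above; the helper sandwich.se and the object hc.se are hypothetical names introduced here, and the seed is arbitrary.

```r
# Sketch: HC0 to HC3 standard errors for one simulated sample, base R only
set.seed(2)                        # arbitrary seed, for reproducibility only
n <- 100
alpha <- 0.5
x.t <- runif(n)
y.t <- 1 + x.t + rnorm(n, 0, x.t^alpha)
fit <- lm(y.t ~ x.t)

X <- model.matrix(fit)             # design matrix, including the constant
k <- ncol(X)
e <- residuals(fit)
h <- hatvalues(fit)                # hat (leverage) values h.i
xtx.inv <- solve(crossprod(X))

# Hypothetical helper: sandwich standard errors for a given vector of
# (possibly adjusted) squared residuals
sandwich.se <- function(omega) {
    meat <- crossprod(X * omega, X)    # X' diag(omega) X
    sqrt(diag(xtx.inv %*% meat %*% xtx.inv))
}

hc.se <- rbind(
    HC0 = sandwich.se(e^2),
    HC1 = sandwich.se(e^2 * n / (n - k)),
    HC2 = sandwich.se(e^2 / (1 - h)),
    HC3 = sandwich.se(e^2 / (1 - h)^2)
)
colnames(hc.se) <- c("intercept", "slope")
print(hc.se)
```

Since every hat value lies between 0 and 1, the HC2 and HC3 adjustments inflate each squared residual, so their standard errors are never smaller than HC0. If the sandwich package is installed, its vcovHC function with type = "HC0" through "HC3" should give the corresponding covariance matrices.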

Of these, HC3 seems to be the best way to correct the standard errors for heteroscedasticity.