Question 3
What is the relation between the correlation of two independent variables in a dataset and the correlation of the parameter estimates for those two variables?
> library(faraway)
> data(savings)
> fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, savings)
> cor(savings$pop15, savings$pop75)
[1] -0.9084787
> summary(fit, corr = TRUE)$corr
            (Intercept)      pop15       pop75        dpi        ddpi
(Intercept)   1.0000000 -0.9841640 -0.80911114 -0.1658813 -0.18826530
pop15        -0.9841640  1.0000000  0.76535591  0.1799079  0.10246580
pop75        -0.8091111  0.7653559  1.00000000 -0.3670459 -0.05472238
dpi          -0.1658813  0.1799079 -0.36704594  1.0000000  0.25548434
ddpi         -0.1882653  0.1024658 -0.05472238  0.2554843  1.00000000
We see that the correlation between the coefficient estimates for pop15 and pop75 is 0.765, whereas the correlation between the two predictors themselves is -0.908.
The correlation between two predictors and the correlation between the coefficient estimates of those predictors are often opposite in sign. Loosely speaking, two positively correlated predictors attempt to perform the same job of explanation. The more work one does, the less the other needs to do, hence a negative correlation between their coefficient estimates. For negatively correlated predictors, as seen here, the effect is reversed and the coefficient estimates are positively correlated.
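The sign flip can be checked on simulated data. The following is a minimal sketch, not part of the original analysis: the variables x1, x2, y and the model fit2 are hypothetical and are generated only for illustration. It assumes a model with an intercept and exactly two predictors, in which case the correlation between the two coefficient estimates is exactly the negative of the correlation between the predictors.

```r
# Simulated data for illustration only (this is not the savings dataset).
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.8 * x1 + 0.6 * rnorm(n)   # built to be positively correlated with x1
y  <- 1 + 2 * x1 - x2 + rnorm(n)

fit2 <- lm(y ~ x1 + x2)

# Correlation between the two predictors
r_pred <- cor(x1, x2)

# Correlation between the two coefficient estimates, obtained by
# rescaling the estimated covariance matrix of the coefficients
r_coef <- cov2cor(vcov(fit2))["x1", "x2"]

print(c(predictors = r_pred, coefficients = r_coef))

# With an intercept and only two predictors, the two values are exact negatives
print(all.equal(r_coef, -r_pred))
```

In the savings fit above the magnitudes differ (-0.908 against 0.765) because dpi and ddpi are also in the model: with further predictors, the correlation between two coefficient estimates is the negative of the partial correlation between the two predictors given the others, not of their raw correlation.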
It is very important to test one's understanding of correlation!