Wald Test in R
via – Sixth Degree
In 2004, I attended the ICPSR summer program in Quantitative Methods in Social Research in Ann Arbor Michigan. I went there to learn a bit more about the fundamentals of Maximum Likelihood Estimation. The course was taught by Professor Charles Franklin from the University of Wisconsin. It was a great course with morning lectures and afternoons filled with hands-on exercises. The summer program in that year was also my first experience with the statistical programming language R. The wide-ranging possibilities of R got me hooked ever since. Recently, even The New York Times wrote a piece on the open-source project, which R is. You can read that piece here.
Anyway, Franklin used R to instruct the students on how to estimate Maximum Likelihood models. He did not do so relying on already existing software routines, but he had written an R-library to do the estimation of many different Maximum Likelihood models. His library included:
(Heteroskedastic) Normal Regression;
Probit Regression;
Logit Regression;
Log-log Regression;
Complementary Log-log Regression;
Cauchy Regression;
Ordered Probit;
Ordered Logit;
Multinomial Logit;
Marginal effects routines for the aforementioned models; and
A function to do a Wald test on Maximum Likelihood objects in R.
It was this last routine that I needed recently.
In class, Professor Franklin had described what he called “The Trinity of Classic Tests”, namely: The Likelihood Ratio test, the Wald test, and the Lagrange Multiplier test. The second is used to test whether hypothesized constraints are consistent with some unconstrained maximum likelihood model. For example, it allows for testing whether two parameters in a regression model are equivalent and that is what I tried to do recently.
However, I was not playing around with classical Maximum Likelihood models. Instead I was estimating a so-called mixed (or “hierarchical”, or “multilevel”) model using the lme4 package developed by Douglas Bates and others. Many of these mixed models are estimated (due to the fact that no closed-form solution exists) using algorithms that only approximate the log-likelihood. Nevertheless, I wanted to test the equivalence of several parameters in my model, and thought of the R-library written by Franklin.
Unfortunately, it wasn’t straightforward to apply his routine for the Wald test to my problem at hand. Apparently, over the years, several R commands had changed, and consequently, Franklin’s code did not work out-of-the-box (not that it was ever packed in a box, but that is what’s so great about open-source software). Backward compatibility was clearly out of the question, so I had to dig into Franklin’s code and solve some issues. I only made some minor changes to apply his original code (which he had of course written to match his code for estimating the models above) to my mixed model. So, for those of you who are happy to find the piece of code below…it is Charles Franklin that deserves the credit.
Here’s the code that resulted after solving the issues due to lack of backward compatibility:
Wald<-function(object,R,q) {
if (!is.matrix(R)) stop(”Restrictions must be a matrix”)
b<-fixef(object)
vc<-vcov(object)
w<-t(R%*%b-q)%*%solve(R%*%vc%*%t(R))%*%(R%*%b-q)
pw<-1-pchisq(w[1],length(q))
cat(”*************\n* Wald Test *\n*************\n”)
cat(”lme4 fixed effects:\n”)
print(fixef(object))
cat(”\nRestrictions:\n”)
print(R)
cat(”\nq = “,q)
cat(”\nChi-square:”, round(w[1],3),” df = “,length(q))
cat(”\nProb x>chisq:”, round(pw,5),”\n”)
}
OK, fine, but how to use this Wald function?
Here’s an example (for which I also use the mlmRev package):
require(lme4)
require(mlmRev)
names(Exam)
summary(model.test<-lmer(normexam~1|school)+sex+schgend,data=Exam))
R <- matrix(c(0,0,1,-1),1,4,byrow=T)
q<-c(0)
Wald(model.test,R,q)
This leads to the following output:
> require(lme4)
require(mlmRev) names(Exam) [1] “school” “normexam” “schgend” “schavg” “vr” “intake” [7] “standLRT” “sex” “type” “student” summary(model.test<-lmer(normexam~1|school)+sex+schgend,data=Exam)) Linear mixed model fit by REML Formula: normexam ~ (1 | school) + sex + schgend Data: Exam AIC BIC logLik deviance REMLdev 10992 11029 -5490 10967 10980 Random effects: Groups Name Variance Std.Dev. school (Intercept) 0.16365 0.40454 Residual 0.83970 0.91635 Number of obs: 4059, groups: school, 65
Fixed effects: Estimate Std. Error t value (Intercept) 0.03347 0.07502 0.446 sexM -0.26175 0.04163 -6.288 schgendboys 0.19063 0.15426 1.236 schgendgirls 0.12244 0.12154 1.007
Correlation of Fixed Effects: (Intr) sexM schgndb sexM -0.286 schgendboys -0.409 -0.131 schgendgrls -0.617 0.177 0.253
R <- matrix(c(0,0,1,-1),1,4,byrow=T) q<-c(0) Wald(model.test,R,q) ************* * Wald Test * ************* lme4 fixed effects: (Intercept) sexM schgendboys schgendgirls 0.03347093 -0.26175366 0.19062984 0.12243517
Restrictions: [,1] [,2] [,3] [,4] [1,] 0 0 1 -1
q = 0 Chi-square: 0.16 df = 1 Prob x>chisq: 0.68934
As you can see, the test result shows no significant (p=.68934) difference between the two parameters for “schgendboys” and “schgendgirls”.
In this example, I only tested one constraint, but the Wald function can easily be used to test several constraints simultaneously. Testing multiple constraints simultaneously is implemented by extending the matrix R by adding more rows and by extending the vector q by adding more elements. Both need to be specified before running the Wald funtion.
Basically, in the R matrix (R stands for restrictions, at least in this case!) you have to set up the matrix of linear constraints. An equality constraint is implemented by using a 1 for a specific coefficient and a -1 for the other coefficient. Every row needs the same number of elements as there are fixed effects in the lmer model (four in the example). Because I wanted to test whether the exam scores of both boy schools and girl schools deviate equally from the mixed schools (which is the reference category in the model), I set up an equality constraint for the last two fixed effects. In the q vector, I specified only one zero (0), because I wanted to test the equality constraint against the null-hypothesis that there is no (0=zero) difference. The use of 1 and -1 in the R matrix for testing an equality constraint follows from the fact that B1=B2 equals 1*B1 -1*B2 = 0.
If one would want to test whether some effect is equal in size but of opposite sign, one should simply use 1’s for both parameters instead of 1 and -1 in the R matrix.