## Fitting multilevel models using packages other than R and Bugs

Our new book on regression and multilevel models is written using R and Bugs. But we’d also like, in an appendix, to quickly show how to fit multilevel models using other software, including Stata, Sas, Spss, MLWin, and HLM (and others?). We’d really appreciate your help in getting sample code for these packages.

We’ve set up five simple examples. I’ll give them below, along with the calls in R that would be used to fit the models. We’re looking for the comparable commands in the other languages. We don’t actually need anyone to fit the models, just to give the code. This should be helpful for students who will be learning from our book using other software platforms.

Example 1: Varying-intercept linear regression with data “y”, predictor “x”, and grouping variable “group”: y_i = a_{group[i]} + b*x_i + error_i. (That is, “group” is an index variable taking on integer values 1 through J, where J is the number of groups.)

R code:
lmer (y ~ x + (1 | group))

Example 2: Same as Example 1, but with a group-level predictor “u” (a vector of length J); that is, the group-level model for the intercepts is a_j = gamma.0 + gamma.1*u_j + error_j

R code:
u.full <- u[group] lmer (y ~ x + u.full + (1 | group)) [We need to define u.full to make a predictor that is the same length as the data; as far as we know, the lmer() function in R does not take group-level predictors.] Example 3: Same as Example 2, but with varying intercepts and varying slopes: y_i = a_{group[i]} + b_{group[i]}*x_i + error_i, where the J pairs (a_j,b_j) follow a bivariate normal distribution with mean vector (gamma.0 + gamma.1*u[j], delta.0 + delta.1*u[j]) and unknown covariance matrix, with all parameters estimated from the data.

R code:
lmer (y ~ x + u.full + x:u.full + (1 + x | group))

[In the earlier version of this entry, I didn’t have x:u.full, but I need it here so that u is included as a predictor in the group-level regression.]

[Also, the “1” in the “1+x” above is not necessary; it’s actually implicit that the constant term is included. I kept the “1” in there just to be clear.]

Example 4: Go back to Example 1, but with binary data and logistic regression: Pr(y_i=1) = invlogit (a_{group[i]} + b*x_i)

R code:
lmer (y ~ x + (1 | group), family=binomial(link=”logit”))

Example 5: Go back to Example 1, but with count data and overdispersed Poisson regression with “offset” log(z): y_i ~ overdispersed Poisson (z_i*exp(a_{group[i]} + b*x_i)). [It’s important to have both the overdispersion and the offset because these are important components to realistic count-data models.]

R code:
log.z <- log(z) lmer (y ~ x + (1 | group), offset=log.z, family=quasipoisson(link="log")) Example 6: A two-way data structure with replication. For convenience, label the index variables for the groupings as “state” and “occupation”, so that the model is y_i = mu + alpha_{state[i]} + beta_{occupation[i]} + gamma_{state[i],occupation[i]} + error_i. We want the alpha’s, the beta’s, and the gamma’s to be modeled (each with their own hierarchical normal distribution); for simplicity, we assume no other predictors in the model.

R code:
state.occupation <- max(occupation)*(state - 1) + occupation lmer (y ~ 1 + (1 | state) + (1 | occupation) + (1 | state.occupation)) [The first line was needed to define an index variable that rasters over all the states and occupations.] These examples don't come close to exhausting the kinds of multilevel models that we are fitting--but we think they will be enough to get people started with fitting these methods with software other than R and Bugs. Thanks for your help! P.S.: I added the x:u.full interaction in Example 3.

P.P.S.: Test files and examples are here.

1. jerry says:

As far as Stata is concerned, try to post on their mailing-list: http://www.stata.com/statalist/ You could also try to contact StataCorp directly: http://www.stata.com/company/contact.html They will certainly be glad to help you.

2. Alex Tabarrok says:

For Stata just get Multilevel and Longitudinal Modeling using Stata by Rabe-Hesketh and Skrondal.

Best

Alex

3. J Bravo says:

I agree. Rabe-Hesketh' & Skrondal wrote a very nice book on the matter, "Multilevel and Longitudinal Modeling Using Stata". See also their "Generalized Latent Variable Modeling" book.

4. js says:

A section of the MLwiN website contains reviews of multilevel modeling in a number of other packages, usually with sample code: Software Reviews. I'm not sure how closely any of their examples will match yours, but it might be a good place to look.

5. Eduardo Leoni says:

<pre>
Stata can do all of these models (except poisson with extra variation) even without using gllamm.

Draft Stata code (I would have to double check some of those.)

Example 1:
xtreg y x, i(group)
Stata 9 has another option:
xtmixed y x || group:

Example 2:
xtreg y x u.full, i(group) re
Stata 9 has another option:
xtmixed y x u.full || group:

Example 3:
Stata 9 code:
xtmixed y x u.full || group: x, cov(unstruct)

Example 4:
xtlogit y x, i(group)

Example 5:
no idea. Would negative binomial do?
xtnbreg y x, i(group) re offset(log.z)

Example 6:
I am not so sure about this one, I didn't fully understand the model. I would try:

xtmixed y || _all: R.state || _all:R.occupation
</pre>

6. Ming says:

How to get the predicted value and parameters after fitting a model using lmer()?

7. Andy says:

I'm currently trying to use lmer to model some psychological data. I'm confused by (a) how to judge the overall fit of the various models, (b) how to do stepwise stuff (is that statistically evil?), (c) how to judge in a standardized fashion the contribution of each of the predictors.

Do you have any examples which hint at where I should be looking?