Two-stage regression as an approximation to multilevel modeling

Jeff Lane writes:

I was just talking with Delia about two-stage regression compared to multilevel analysis, and we were looking at Two-Stage Regression and Multilevel Modeling: A Discussion of Several Papers for the Journal “Political Analysis” and at the 2005 blog discussion in which you posted the following response to someone struggling with the choice of models:

Before doing multilevel modeling, I would do a so-called “two-stage regression”–that is, fitting the regression model separately in each country, then estimating level-2 effects by running regressions on the country-level regression. By fitting the models separately, you’re automatically allowing slopes as well as intercepts to vary by country. (If you then want to interpret the intercepts, you have to make sure that your level-1 predictors are coded so that their zero-levels are interpretable.)

It sounds like there are two sorts of things you’re interested in: the level-1 coefficients and how they vary by country, and the level-2 coefficients that describe variation between countries. To understand the level-1 coefs, I’d make a series of plots showing the ests and se’s for all the countries, for each of the level-1 predictors. The level-2 coefs should be easy enough to interpret. With the two-stage regression, the se’s on the level-2 coefs automatically account for the variation between countries.

OK, so what about multilevel modeling? MLM takes more effort; the payoff, compared to two-stage regression, comes when the level-1 coefficients in the individual countries cannot be estimated accurately–when their se’s are large compared to their unexplained variation (an issue we discuss a bit in this paper). If that’s an issue then, yes, I’d recommend multilevel modeling as a way of better estimating the level-1 and level-2 regression coefficients.

It’s irrelevant whether your study includes all the countries of Europe, or just a subset. Multilevel modeling is fine in either case.

I [Jeff] am in similar waters with my data as the guy who wrote in and I’d like to do the suggested two-stage regression but it’s unclear to me what exactly each stage entails.
Does the first stage consist of 20 models–one for each country–that each include only the level-1 covariates? With these coefficients and SEs, the variation from one country to the next for a given level-1 effect is assessed and the researcher can then move on to the second stage in which the level-2 variables are regressed on the level-1 coefficients(?) or SEs(?) to see how well the level-2 variables explain the between-country variation in the level-1 effects. I’m thrown off by the line “then estimating level-2 effects by running regressions on the country-level regression” and can’t tell exactly what should be regressed on what.

My reply: Yes, the first stage is the regression in each country using only the level-1 covariates. You then take the coefs from this first stage (ignoring the se’s) and regress them on the country-level predictors; that is, the first-stage coefficients are the outcomes in the second stage, with a separate second-stage regression for each level-1 coefficient. It’s possible to design more elaborate two-stage regression procedures (for example, partially weighting by the se’s from the first regression), but that becomes a bit silly; at that point you might as well just fit a multilevel model. Conversely, even if you want to fit a multilevel model, it might be a good idea to fit the two-stage model first as a way of better understanding what’s going on in your data.
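To make "what is regressed on what" concrete, here is a minimal sketch of the two stages in Python with NumPy. Everything in it is an assumption made for illustration: the simulated data, the single level-1 predictor x, the single country-level predictor u, and the helper names ols and two_stage are all hypothetical, and the second stage is plain unweighted least squares, as described above.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients and their classical standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

def two_stage(x, y, country, u):
    """x, y: individual-level arrays; country: integer label 0..J-1 for
    each row; u: country-level predictor of length J (all hypothetical)."""
    J = len(u)
    coefs = np.empty((J, 2))  # per-country intercept and slope
    ses = np.empty((J, 2))    # their se's, kept for plotting only
    # Stage 1: a separate regression of y on x within each country.
    for j in range(J):
        rows = country == j
        X1 = np.column_stack([np.ones(rows.sum()), x[rows]])
        coefs[j], ses[j] = ols(X1, y[rows])
    # Stage 2: each column of stage-1 coefs is the outcome, regressed on
    # the country-level predictor. The stage-1 se's are ignored here.
    X2 = np.column_stack([np.ones(J), u])
    stage2 = {name: ols(X2, coefs[:, k])
              for k, name in enumerate(["intercept", "slope"])}
    return coefs, ses, stage2

# Simulated example (assumed numbers): 20 countries, 200 people in each.
rng = np.random.default_rng(0)
J, n = 20, 200
u = rng.normal(size=J)
a = 1.0 + 0.5 * u + rng.normal(0, 0.3, J)  # true country intercepts
b = 2.0 - 1.0 * u + rng.normal(0, 0.3, J)  # true country slopes
country = np.repeat(np.arange(J), n)
x = rng.normal(size=J * n)
y = a[country] + b[country] * x + rng.normal(0, 1.0, J * n)

coefs, ses, stage2 = two_stage(x, y, country, u)
gamma, gamma_se = stage2["slope"]
print("level-2 regression for the slopes (const, u):", gamma.round(2))
print("standard errors:", gamma_se.round(2))
```

The coefs and ses arrays are what you would plot, country by country, to see how the level-1 coefficients vary. If the entries of ses are large compared to the spread of the coefs across countries, that is the situation in which fitting the multilevel model is worth the extra effort.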