Should we trust inferences from multilevel models?

Markus Loecher asks some questions that come up on occasion:

I am in the middle of going through your book on multilevel/hierarchical models . . . hope to apply them to some multi-scale spatial problems that I am working on. I am still struggling with some of the more subtle implications of the mixed effects model and the “partial pooling” approach, which, even when formulated in more frequentist terminology, seems to have a distinct Bayesian flavor to it?
One particular source of confusion to me is Figure 12.1.
While I understand the problems with estimates that rely on small sample sizes and also find the shrunken estimates in Fig. 12.1b appealing, I cannot overcome my feeling that this “dilemma” should be taken care of and expressed in the respective wider confidence intervals.
Pooling across counties seems like quite a strong assumption on exchangeability?
A more mundane question is: how did you get the confidence intervals for sample sizes of 1?

My response:

1. Frequentist, Bayesian: they’re just words. What matters is what you’re doing. That’s why we talk about partial pooling, ‘cos that’s what we’re doing to the data. If you push me on it, though, yeah, the entire book is Bayesian.

2. You’d like the intervals in Figure 12.1b to be wider. Here’s a way to think of it: suppose the data we saw were a random sample from a huge dataset, with thousands of measurements per county. Now suppose you wanted to use the no-pooling or the multilevel estimates to predict the average of the thousands of measurements in each county. Finally, suppose that you want to give each estimate a confidence interval, so that in each case there’s a 68% chance that the interval contains the true value. Then you’d find (assuming the model is correct) that, indeed, the no-pooling estimates would require those really wide confidence intervals (as in Fig. 12.1a), but the multilevel estimates would need only narrower intervals (as in Fig. 12.1b).
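
Here is a minimal simulation sketch of that thought experiment. It is not the analysis from the book: the parameter values (mu, tau, sigma), the range of sample sizes, and the function name simulate_coverage are all made up for illustration, and it assumes the group-level mean and variance are known rather than estimated. Under those assumptions, both kinds of 68% intervals should cover the true county mean about 68% of the time, but the multilevel intervals are narrower.

```python
import numpy as np

def simulate_coverage(n_counties=20000, mu=1.5, tau=0.3, sigma=0.8, seed=0):
    """Compare 68% intervals from no pooling and from partial pooling.

    All parameter values are hypothetical. mu and tau (group-level mean and
    sd) and sigma (within-county sd) are treated as known, which is a
    simplifying assumption.
    """
    rng = np.random.default_rng(seed)

    # Small per-county sample sizes, from 1 to 10.
    n = rng.integers(1, 11, size=n_counties)

    # True county means, drawn from the group-level model.
    alpha = rng.normal(mu, tau, size=n_counties)

    # The mean of n_j draws from N(alpha_j, sigma^2) is N(alpha_j, sigma^2 / n_j).
    se_no_pool = sigma / np.sqrt(n)
    ybar = rng.normal(alpha, se_no_pool)

    # Partial pooling: precision-weighted average of ybar_j and mu.
    precision = n / sigma**2 + 1 / tau**2
    est_ml = (n / sigma**2 * ybar + mu / tau**2) / precision
    se_ml = 1 / np.sqrt(precision)

    # Fraction of +/- 1 standard error intervals that contain the true mean.
    return {
        "coverage_no_pooling": float(np.mean(np.abs(ybar - alpha) <= se_no_pool)),
        "coverage_multilevel": float(np.mean(np.abs(est_ml - alpha) <= se_ml)),
        "mean_halfwidth_no_pooling": float(se_no_pool.mean()),
        "mean_halfwidth_multilevel": float(se_ml.mean()),
    }

if __name__ == "__main__":
    for name, value in simulate_coverage().items():
        print(f"{name}: {value:.3f}")
```

The “assuming the model is correct” caveat matters here too: the simulation draws the true county means from the same group-level distribution that the multilevel estimate uses, so the narrower intervals are earned by that extra information, not by ignoring uncertainty.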

3. “Exchangeability” refers to statistical methods (or, more generally, mathematical formulas) that treat the different groups (in this case, counties) symmetrically. The no-pooling, multilevel, and complete-pooling procedures are all exchangeable. So if you don’t like exchangeability, there’s no reason to pick on the multilevel model. More to the point, the multilevel model allows you to easily go beyond exchangeability by adding group-level predictors, as illustrated later on in the chapter.
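
To make the contrast concrete, here is one way to write the two versions of the group-level model. The notation is a sketch: j[i] denotes the county of observation i, and u_j stands for a hypothetical county-level predictor; neither the symbols nor the choice of predictor should be read as the exact specification from the chapter.

```latex
% Data model, the same in both versions:
y_i \sim \mathrm{N}\!\left(\alpha_{j[i]},\ \sigma_y^2\right)

% Exchangeable group-level model: every county is treated symmetrically.
\alpha_j \sim \mathrm{N}\!\left(\mu_\alpha,\ \sigma_\alpha^2\right)

% With a group-level predictor u_j: counties are exchangeable only
% after conditioning on u_j.
\alpha_j \sim \mathrm{N}\!\left(\gamma_0 + \gamma_1 u_j,\ \sigma_\alpha^2\right)
```

In the second version, each county is pulled toward the regression line at its own value of u_j rather than toward a single overall mean.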

4. Even when sample size is 1 (or 0), you can get confidence intervals: the info comes from the group-level model.
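
A small sketch of how that works in the simplest normal model. Again, the function name multilevel_interval and the numbers are hypothetical, and the group-level mean mu, group-level sd tau, and within-county sd sigma are assumed known. With zero observations the interval is just the group-level distribution, mu plus or minus tau; with one observation it tightens a bit.

```python
import numpy as np

def multilevel_interval(observations, mu, tau, sigma):
    """Return (estimate, lower, upper) for one county's mean.

    The interval is estimate +/- 1 posterior sd (roughly 68%), assuming
    alpha_j ~ N(mu, tau^2) and y_i ~ N(alpha_j, sigma^2) with mu, tau,
    and sigma known. Works for any sample size, including 0.
    """
    y = np.asarray(observations, dtype=float)
    n = y.size

    # Precision from the data (zero when n == 0) plus precision from the group level.
    precision = n / sigma**2 + 1 / tau**2

    # Precision-weighted combination; y.sum() is 0.0 for an empty sample.
    estimate = (y.sum() / sigma**2 + mu / tau**2) / precision
    sd = 1 / np.sqrt(precision)
    return float(estimate), float(estimate - sd), float(estimate + sd)

if __name__ == "__main__":
    # Hypothetical values, chosen only for illustration.
    print(multilevel_interval([], mu=1.5, tau=0.3, sigma=0.8))     # no data
    print(multilevel_interval([2.5], mu=1.5, tau=0.3, sigma=0.8))  # one measurement
```

The no-pooling approach has nothing comparable to offer here: with one observation there is no within-county variance estimate, and with zero there is no estimate at all.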