Last year we saw that individual-level loss may not be great for choosing models for MRP (“individualism doesn’t work”).
The typical machine learning looks at individual-level Loss(y_i, yhat_i).
But for MRP we care about population-level Loss(E[Y], E[yhat_i]) where E[Y] is the unknown population mean and E[yhat_i] is our MRP estimate.

Earlier this month we saw that the model that minimizes individual-level loss in the sample may not be the model that minimizes individual-level loss in the population:

Kuh et al. 2023 tried a weighted-to-the-population individual-level loss but saw this still ordered models quite differently from the population-level loss. So the issue isn’t just the weighting, it’s the aggregation.
Ok but with individual-level Loss(y_i, yhat_i) we have the ground truth y_i in our survey.
For population-level Loss(E[Y], E[yhat_i]) we don’t have the ground truth E[Y].
Kennedy et al. 2024 replace E[Y] with the classical poststratification estimate E[ybar_X] (see the post on poststratification). But this is minimized when the multilevel regression (“MR” of MRP) is ybar_X, a data summary rather than a regularized model. This may overfit to the survey data and generalize poorly. This is analogous to minimizing training error for individual-level loss, see ESL p.221:

As in ESL, Kennedy et al. 2024 handle this with cross-validation.
(I just finished writing the following poem and saw the picture of the scenery accompanying this blogpost which I thought was very fitting. I hope it’s okay to share the poem here. Wonderful view!)
Perhaps the thing with poetry
Is that it can show what you can see
In the exact moment in which it might be
That the path to the view is momentarily free
AAAnonymous, I think this is the first poem I’ve seen in the comments ! It is most welcome. Thank you for sharing !
Shira:
I agree that this is an important and confusing issue. Wei Wang and I discussed this problem with measuring fit at the individual level in our 2015 paper, Difficulty of selecting among multilevel models using predictive accuracy.
Thanks, Andrew ! Yup, I discussed that paper in the linked post: https://statmodeling.stat.columbia.edu/2025/06/24/survey-statistics-poststratification/
But does this paper discuss population or other aggregate-level errors ? I only saw discussion of individual-level loss, but maybe I missed something ?
Square brackets are back in fashion, yay!
Haha, yes Carlos, at least until Andrew changes them to parentheses :) or maybe :]