Survey Statistics: individualism doesn’t work (even when weighted)

Posted on March 17, 2026 5:00 PM by shira

Last year we saw that individual-level loss may not be great for choosing models for multilevel regression and poststratification, or MRP (“individualism doesn’t work”).

Typical machine learning looks at individual-level Loss(y_i, yhat_i).

But for MRP we care about population-level Loss(E[Y], E[yhat_i]) where E[Y] is the unknown population mean and E[yhat_i] is our MRP estimate.
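To make the distinction concrete, here is a minimal sketch with two poststratification cells. Everything in it is made up for illustration: the cell counts, the survey responses, the two hypothetical models, and the assumption that E[Y] is known (which it never is in practice). Squared error stands in for Loss.

```python
# Toy setup: all numbers are invented for illustration.
N = [500, 500]                      # population count per cell
y = {0: [1, 0, 0, 0, 0],            # survey responses in cell 0 (mean 0.2)
     1: [1, 1, 1, 1, 0]}            # survey responses in cell 1 (mean 0.8)
true_mean = 0.5                     # E[Y], assumed known only in this toy

# Two hypothetical models, each giving one prediction per cell.
model_a = [0.30, 0.90]   # off by 0.10 in the same direction in both cells
model_b = [0.05, 0.95]   # off by 0.15 in opposite directions

def individual_loss(pred):
    """Mean squared error over respondents: Loss(y_i, yhat_i)."""
    losses = [(yi - pred[j]) ** 2 for j, ys in y.items() for yi in ys]
    return sum(losses) / len(losses)

def poststratify(pred):
    """Population-weighted average of the cell predictions."""
    return sum(n * p for n, p in zip(N, pred)) / sum(N)

def population_loss(pred):
    """Squared error of the aggregate: Loss(E[Y], E[yhat_i])."""
    return (true_mean - poststratify(pred)) ** 2

for name, pred in [("A", model_a), ("B", model_b)]:
    print(name, individual_loss(pred), population_loss(pred))
```

In this toy, model A has the smaller individual-level loss, but model B has the smaller population-level loss, because B's cell-level errors cancel when aggregated.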

Earlier this month we saw that the model that minimizes individual-level loss in the sample may not be the model that minimizes individual-level loss in the population.

Kuh et al. 2023 tried an individual-level loss weighted to the population, but found that it still ordered models quite differently from the population-level loss. So the issue isn’t just the weighting, it’s the aggregation.
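A toy sketch of why weighting alone may not help. This is not the setup from Kuh et al. 2023: the cells, the responses, the two hypothetical models, and the weights N_j / n_j are all assumptions chosen for illustration, and E[Y] is treated as known only so the two losses can be compared.

```python
# Toy setup: cell 0 is oversampled relative to its population share.
N = [500, 500]                                  # population count per cell
y = {0: [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],         # n = 10, mean 0.2
     1: [1, 1, 1, 1, 0]}                        # n = 5,  mean 0.8
true_mean = 0.5                                 # E[Y], known only in this toy

model_a = [0.30, 0.90]   # hypothetical: errors in the same direction
model_b = [0.05, 0.95]   # hypothetical: larger errors that cancel

def weighted_individual_loss(pred):
    """Squared error per respondent, weighted by N_j / n_j."""
    num = den = 0.0
    for j, ys in y.items():
        w = N[j] / len(ys)          # each respondent stands for w people
        for yi in ys:
            num += w * (yi - pred[j]) ** 2
            den += w
    return num / den

def population_loss(pred):
    """Squared error of the poststratified estimate against E[Y]."""
    estimate = sum(n * p for n, p in zip(N, pred)) / sum(N)
    return (true_mean - estimate) ** 2

for name, pred in [("A", model_a), ("B", model_b)]:
    print(name, weighted_individual_loss(pred), population_loss(pred))
```

Even after weighting to the population, the individual-level loss prefers model A here, while the population-level loss prefers model B.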

Ok but with individual-level Loss(y_i, yhat_i) we have the ground truth y_i in our survey.

For population-level Loss(E[Y], E[yhat_i]) we don’t have the ground truth E[Y].

Kennedy et al. 2024 replace E[Y] with the classical poststratification estimate E[ybar_X] (see the post on poststratification). But this loss is minimized when the multilevel regression (the “MR” of MRP) is ybar_X, a data summary rather than a regularized model. This may overfit to the survey data and generalize poorly. This is analogous to minimizing training error for individual-level loss (see ESL p. 221).
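A small sketch of why this proxy loss rewards the unregularized cell means. The data, the cell counts, and the shrinkage factor `lam` are hypothetical, and the "shrunk" predictions are only a crude stand-in for what a multilevel regression would do.

```python
# Toy data: two cells, with cell 0 oversampled.
N = [500, 500]                                  # population count per cell
y = {0: [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],         # cell mean 0.2
     1: [1, 1, 1, 1, 0]}                        # cell mean 0.8

def poststratify(pred):
    """Population-weighted average of the cell predictions."""
    return sum(n * p for n, p in zip(N, pred)) / sum(N)

# ybar_X: the raw sample mean in each cell.
cell_means = [sum(y[j]) / len(y[j]) for j in range(len(N))]

# Classical poststratification estimate, used in place of the unknown E[Y].
classical_ps = poststratify(cell_means)

def proxy_loss(pred):
    """Squared distance between a model's estimate and the classical one."""
    return (classical_ps - poststratify(pred)) ** 2

# A crude stand-in for partial pooling: shrink toward the sample grand mean.
n_total = sum(len(ys) for ys in y.values())
grand_mean = sum(sum(ys) for ys in y.values()) / n_total
lam = 0.5                                       # hypothetical shrinkage factor
shrunk = [lam * m + (1 - lam) * grand_mean for m in cell_means]

print("cell means:", proxy_loss(cell_means))
print("shrunk:    ", proxy_loss(shrunk))
```

The cell means get a proxy loss of exactly zero by construction, so any regularized model can only tie or lose on this criterion when it is evaluated in-sample.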

As in ESL, Kennedy et al. 2024 handle this with cross-validation.
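One way such a cross-validation could look is sketched below. This is an assumption-laden illustration, not necessarily the scheme Kennedy et al. 2024 use: the fold assignment, the held-out target (the classical poststratification estimate computed on the held-out fold), the toy data, and both fitting functions are hypothetical.

```python
N = [500, 500]                                    # population count per cell
K = 3                                             # number of folds
y = {0: [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],     # toy responses, cell 0
     1: [1, 1, 0, 1, 1, 1]}                       # toy responses, cell 1

def poststratify(pred):
    """Population-weighted average of the cell predictions."""
    return sum(n * p for n, p in zip(N, pred)) / sum(N)

def cell_means(data):
    return [sum(data[j]) / len(data[j]) for j in range(len(N))]

def split(k):
    """Within each cell, respondent i goes to fold i % K."""
    train = {j: [v for i, v in enumerate(ys) if i % K != k] for j, ys in y.items()}
    held = {j: [v for i, v in enumerate(ys) if i % K == k] for j, ys in y.items()}
    return train, held

def fit_no_pooling(data):
    """The data summary ybar_X."""
    return cell_means(data)

def fit_partial_pooling(data, lam=0.5):
    """Crude stand-in for a regularized model: shrink to the grand mean."""
    n = sum(len(v) for v in data.values())
    grand = sum(sum(v) for v in data.values()) / n
    return [lam * m + (1 - lam) * grand for m in cell_means(data)]

def in_sample_loss(fit):
    """Proxy loss when fitting and evaluating on the same data."""
    return (poststratify(cell_means(y)) - poststratify(fit(y))) ** 2

def cv_loss(fit):
    """Fit on K-1 folds, compare to the classical estimate on the held-out fold."""
    total = 0.0
    for k in range(K):
        train, held = split(k)
        target = poststratify(cell_means(held))
        total += (target - poststratify(fit(train))) ** 2
    return total / K

for name, fit in [("no pooling", fit_no_pooling), ("partial pooling", fit_partial_pooling)]:
    print(name, in_sample_loss(fit), cv_loss(fit))
```

In-sample, the no-pooling fit scores exactly zero, but its cross-validated loss is positive, so the cell means no longer win automatically.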

6 thoughts on “Survey Statistics: individualism doesn’t work (even when weighted)”

AAAnonymous on March 17, 2026 at 5:42 pm said:

(I just finished writing the following poem and saw the picture of the scenery accompanying this blogpost which I thought was very fitting. I hope it’s okay to share the poem here. Wonderful view!)

Perhaps the thing with poetry
Is that it can show what you can see
In the exact moment in which it might be
That the path to the view is momentarily free

- shira on March 18, 2026 at 6:15 am said:
  
  AAAnonymous, I think this is the first poem I’ve seen in the comments ! It is most welcome. Thank you for sharing !
  
Andrew on March 17, 2026 at 7:33 pm said:

Shira:

I agree that this is an important and confusing issue. Wei Wang and I discussed this problem with measuring fit at the individual level in our 2015 paper, Difficulty of selecting among multilevel models using predictive accuracy.

- shira on March 18, 2026 at 6:14 am said:
  
  Thanks, Andrew ! Yup, I discussed that paper in the linked post: https://statmodeling.stat.columbia.edu/2025/06/24/survey-statistics-poststratification/
  
  But does this paper discuss population or other aggregate-level errors ? I only saw discussion of individual-level loss, but maybe I missed something ?
  
Carlos Ungil on March 18, 2026 at 2:26 am said:

Square brackets are back in fashion, yay!

- shira on March 18, 2026 at 6:14 am said:
  
  Haha, yes Carlos, at least until Andrew changes them to parentheses :) or maybe :]
  

Statistical Modeling, Causal Inference, and Social Science
