MRP (or RPP) with non-census variables

Posted on October 28, 2018 9:09 AM by Andrew

It seems to be Mister P week here on the blog . . .

A question came in: someone was doing MRP on a political survey and wanted to adjust for political ideology, a variable for which they can’t get poststratification data.

Here’s what I recommended:

If a survey selects on a non-census variable such as political ideology, or if you simply wish to adjust for it because of potential nonresponse bias, my recommendation is to do MRP on all these variables.

It goes like this: suppose y is your outcome of interest, X are the census variables, and z is the additional variable (in this example, ideology). The idea is to do MRP by fitting a multilevel regression model on y given (X, z), then poststratifying based on the distribution of (X, z) in the population. The challenge is that you don’t have (X, z) in the population; you only have X. So to create the poststratification distribution of (X, z), you do two things: first, take the poststratification distribution of X (known from the census); second, estimate the population distribution of z given X (most simply by fitting a multilevel regression of z given X from your survey data, but you can also use auxiliary information if available). The product of the two gives the estimated joint distribution of (X, z).
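A minimal sketch of the bookkeeping in the recipe above, in plain Python. It assumes the two regressions have already been fit elsewhere and only shows how the poststratification table gets expanded and used. All cell labels, counts, and fitted values are hypothetical numbers made up for illustration, and the function names `expand_poststrat_table` and `poststratify` are likewise invented here.

```python
# Hypothetical inputs; none of these numbers come from the post.

# Census counts N(x) for each demographic cell x (known from the census).
census_counts = {"young": 400, "old": 600}

# Estimated Pr(z | x), e.g. from a multilevel regression of z on X fit to
# the survey data. Each row must sum to 1.
p_z_given_x = {
    "young": {"liberal": 0.5, "conservative": 0.5},
    "old": {"liberal": 0.25, "conservative": 0.75},
}

# Fitted E[y | x, z] from the multilevel regression of y on (X, z).
theta = {
    ("young", "liberal"): 0.8,
    ("young", "conservative"): 0.4,
    ("old", "liberal"): 0.6,
    ("old", "conservative"): 0.2,
}


def expand_poststrat_table(census_counts, p_z_given_x):
    """Return estimated counts N(x, z) = N(x) * Pr(z | x) for every cell."""
    table = {}
    for x, n_x in census_counts.items():
        for z, p in p_z_given_x[x].items():
            table[(x, z)] = n_x * p
    return table


def poststratify(theta, table):
    """Average the cell estimates, weighting each by its estimated count."""
    total = sum(table.values())
    return sum(theta[cell] * n for cell, n in table.items()) / total


table = expand_poststrat_table(census_counts, p_z_given_x)
estimate = poststratify(theta, table)
```

With these made-up inputs the expanded table still sums to the census total of 1000, and the population estimate comes out at about 0.42. Plugging in point estimates like this ignores the uncertainty in Pr(z | x); with posterior simulations you would repeat the same calculation for each draw.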

Yu-Sung and I did this a few years ago in our analysis of public opinion for school vouchers, where one of our key poststratification variables was religion, which we really needed to include for our analysis but which is not on the census. To poststratify, we first modeled religion given demographics—we had several religious categories, and I think we fit a series of logistic regressions. We used these estimated conditional distributions to fill out the poststrat table and then went from there. We never wrote this up as a general method, though.
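The post does not say how the series of logistic regressions was arranged. One possible arrangement, assumed here purely for illustration, is a nested sequence: the first model gives the probability of category 1, the second gives the probability of category 2 among those not in category 1, and so on, with the last category taking what remains. The sketch below shows how such conditional probabilities would combine into one row of Pr(z | X) for the poststrat table. The linear predictors are hypothetical, and `sequential_to_category_probs` is a name made up for this example.

```python
import math


def inv_logit(u):
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-u))


def sequential_to_category_probs(conditional_probs):
    """Convert nested binary probabilities into category probabilities.

    conditional_probs[k] is Pr(category k | not in categories 0..k-1).
    The final category receives whatever probability is left over.
    """
    probs = []
    remaining = 1.0
    for p in conditional_probs:
        probs.append(remaining * p)
        remaining *= 1.0 - p
    probs.append(remaining)
    return probs


# Hypothetical fitted linear predictors for one demographic cell:
# two nested logistic regressions imply three categories.
linear_predictors = [0.0, -1.0]
conditional = [inv_logit(u) for u in linear_predictors]
religion_probs = sequential_to_category_probs(conditional)
```

The resulting probabilities are nonnegative and sum to 1 by construction, which is what each row of the filled-out poststrat table needs.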

7 thoughts on “MRP (or RPP) with non-census variables”

Eric on October 28, 2018 at 12:58 pm said:

Would it be reasonable to fit both multilevel regression models (y given (X, z) and z given X) at the same time in a single Stan run? Would there be any benefits?

Jeff Lax on October 28, 2018 at 2:29 pm said:

Here is a paper doing it for party:
https://www.columbia.edu/~jrl2124/klp2_paper.pdf

Polarizing the Electoral Connection: Partisan Representation in Supreme Court Confirmation Politics
Author(s): Jonathan P. Kastellec, Jeffrey R. Lax, Michael Malecki, and Justin H. Phillips
Source: The Journal of Politics, Vol. 77, No. 3 (July 2015)

And another:
https://www.columbia.edu/~jrl2124/partypurse.pdf

The first propagates the uncertainty in the first stage throughout; the second also propagates the uncertainty but not in the linked version.

- Andrew on October 28, 2018 at 6:22 pm said:
  
  Hey, Jeff. Too bad we have to communicate with each other with a 6-month lag…
  
Mike H on October 29, 2018 at 7:45 am said:

Wrt religion, why model it with census data? Why not use the GSS instead? It includes questions about religion.

- Andrew on October 29, 2018 at 8:10 am said:
  
  Mike:
  
  Using GSS for religion is fine but then you’ll need to do some modeling if you want to use it for MRP, because (a) sample size is way too small to take raw numbers and treat them as population proportions, (b) GSS doesn’t give you the state where the respondent lives, and (c) even if GSS did give state, the data for each state wouldn’t be representative as GSS uses cluster sampling.
  
Devin on October 30, 2018 at 12:57 pm said:

Two papers that address closely related problems are:

Leemann, Lucas, and Fabio Wasserfallen. 2017. “Extending the Use and Prediction Precision of Subnational Public Opinion Estimation.” American Journal of Political Science 61 (4): 1003–1022.

and

Caughey, Devin and Mallory Wang. Forthcoming. “Dynamic Ecological Inference for Time-Varying Population Distributions Based on Sparse, Irregular, and Noisy Marginal Data.” Political Analysis. https://www.dropbox.com/s/hdj1owa73x4jwx8/CaugheyWang-PopEst.pdf?dl=0

JINTAO HUANG on October 28, 2025 at 6:00 pm said:

Hi,

I’m encountering a similar problem and enjoyed reading this great content! My question is: when ideology or religion is estimated from a survey, what kind of assumption are we making? My understanding is that this assumes the distribution of ideology or religion is the same in the survey and the population. Or, by doing the multilevel regression, are we already accounting for the non-response bias?

I’m concerned about the case where the covariate we’re trying to estimate is distributed very differently in the survey and the population. I’m trying to understand how covariate imbalance is adjusted for in MrP.


Statistical Modeling, Causal Inference, and Social Science
