Mister P wins again

Chad Kiewiet De Jonge, Gary Langer, and Sofi Sinozich write:

This paper presents state-level estimates of the 2016 presidential election using data from the ABC News/Washington Post tracking poll and multilevel regression with poststratification (MRP). While previous implementations of MRP for election forecasting have relied on data from prior elections to establish poststratification targets for the composition of the electorate, in this paper we estimate both turnout and vote preference from the same preelection poll. Through Bayesian estimation we are also able to capture uncertainty in both estimated turnout and vote preferences. This approach correctly predicts 50 of 51 contests, showing greater accuracy than comparison models that rely on the 2012 Current Population Survey Voting and Registration Supplement for turnout.

Cool. Also this:

While the model does not perfectly estimate turnout as a share of the voting age population, popular vote shares, or vote margins in each state, it is more accurate than predictions published by polling aggregators or other published MRP estimators.

And more:

The paper also reports how vote preferences changed over the course of the 18-day tracking period, compares subgroup-level estimates of turnout and vote preferences with the 2016 CPS Survey and National Election Pool exit poll, and summarizes the accuracy of the approach applied to the 2000, 2004, 2008, and 2012 elections.
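To spell out the key step in my notation (not theirs): with poll-based estimates of turnout and vote preference in each poststratification cell, the state-level vote share is the turnout-weighted average over cells,

    \hat{\theta}_s = \frac{\sum_{j \in s} N_j \hat{t}_j \hat{\pi}_j}{\sum_{j \in s} N_j \hat{t}_j},

where N_j is the census count of cell j, \hat{t}_j its estimated probability of voting, and \hat{\pi}_j its estimated candidate support among voters. The usual MRP setup takes the turnout weights from a past election; the twist here is that \hat{t}_j and \hat{\pi}_j both come from the same pre-election poll.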

Here are the headings of their results section:

Estimating Turnout from Pre-Election Polls Outperforms Models Based on Historical Data

MRP Based on Pre-Election Polling Anticipated Trump Victory; 2012 Turnout-Based Models Didn’t

Model Estimates Suggest an Electorate Even More Polarized by Education than the Exit Poll

Clinton Consistently Led in the Popular Vote, but Not in the Electoral Vote

MRP Outperforms Polling Aggregators in Accuracy

MRP Performs Fairly Well in Past Elections

They fit their models using Stan, as they explain in a footnote.
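To give a flavor of what this looks like in code, here’s a minimal sketch using rstanarm, which is what the commenter below says they used. This is my reconstruction, not their model: poll_df, cells, voted, vote_rep, and N are hypothetical stand-ins.

    library(rstanarm)

    # Two multilevel logistic regressions fit to the same pre-election poll:
    # one for turnout, one for vote preference.
    fit_turnout <- stan_glmer(
      voted ~ female + (1 | state) + (1 | age_cat) + (1 | educ),
      family = binomial("logit"), data = poll_df
    )
    fit_vote <- stan_glmer(
      vote_rep ~ female + (1 | state) + (1 | age_cat) + (1 | educ),
      family = binomial("logit"), data = poll_df
    )

    # Poststratify: `cells` has one row per state x demographic cell, with
    # its census count N. Posterior means are used here for brevity; carrying
    # the draws through instead is what captures the joint uncertainty in
    # turnout and preference.
    t_hat  <- colMeans(posterior_epred(fit_turnout, newdata = cells))
    pi_hat <- colMeans(posterior_epred(fit_vote, newdata = cells))

    # Turnout-weighted state-level vote share.
    w <- cells$N * t_hat
    state_est <- tapply(w * pi_hat, cells$state, sum) / tapply(w, cells$state, sum)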

4 thoughts on “Mister P wins again”

  1. I was interested to see that they published using the default priors from rstanarm. Do you support that? I recall in the documentation of brms (for example) Paul saying something like “we provide weakly informative priors by default but you are strongly encouraged to use priors that reflect your own beliefs”.
    I ask because I generally use the default priors in rstanarm and brms so long as everything is sampling nicely. But I always imagined that when I came to publish an analysis, saying “I used the software’s default priors” might not cut it.

    • Kristian:

      In any case you should be able to do better than the default, but the default’s a good start. To put it another way: I prefer the default from rstanarm to the default uniform priors in glm, glmer, etc. And people use those default priors in published papers all the time.
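
      To make that concrete: rstanarm will report whatever priors it ended up using, defaults included, and lets you override them. A quick sketch (the formula and data are made up):

          library(rstanarm)

          fit <- stan_glmer(
            vote_rep ~ female + (1 | state),
            family = binomial("logit"), data = poll_df,
            prior = normal(0, 1),              # coefficients
            prior_intercept = normal(0, 2.5),  # intercept
            prior_covariance = decov()         # scales of the varying intercepts
          )

          prior_summary(fit)  # prints the priors actually used

      Reporting the output of prior_summary() in the paper is more informative than just saying “we used the defaults.”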
