Here it is. The model is vaguely based on our past work on Bayesian combination of state polls and election forecasts but with some new twists.
And, check it out: you can download our R and Stan source code and the data! Merlin Heidemanns wrote much of the code, which in turn is based on Pierre Kremp’s model from 2016.
We use multilevel modeling to partially pool information from economic and political “fundamentals,” state polls, and national polls.
Here’s an Economist article giving more background on our approach.
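As a rough illustration of what partial pooling does, and not the actual model in the repository, the sketch below combines a fundamentals-based prior with a poll-based estimate using a simple normal-normal update. The function name `pool_estimates` and every number here are assumptions made up for the example.

```python
def pool_estimates(prior_mean, prior_sd, poll_mean, poll_sd):
    """Precision-weighted combination of a prior and a poll-based estimate.

    Assumes both sources are normal and independent (a simplification).
    Returns the posterior mean and standard deviation.
    """
    prior_precision = 1.0 / prior_sd ** 2
    poll_precision = 1.0 / poll_sd ** 2
    post_precision = prior_precision + poll_precision
    post_mean = (prior_precision * prior_mean
                 + poll_precision * poll_mean) / post_precision
    post_sd = post_precision ** -0.5
    return post_mean, post_sd


# Hypothetical inputs: fundamentals suggest a 52% two-party share (sd 3 points),
# while the poll average suggests 54% (sd 1.5 points).
mean, sd = pool_estimates(0.52, 0.03, 0.54, 0.015)
print(f"pooled share: {mean:.3f} +/- {sd:.3f}")
```

The pooled estimate sits closer to whichever source is more precise, and its uncertainty is smaller than that of either source alone.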
Congratulations. From what I gather, this is amazing work.
Are the samples in the battleground states large enough to forecast their voting behavior?
Mikem:
Sure, the state polls have big samples. Follow the link and you can see the data.
Do you age out older polls?
Paul:
We don’t age anything out, but we have time series models at the national and state levels, so when a poll is far from election day, it is partly explained in the model by transient errors.
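To see why a poll taken far from election day carries less weight without being explicitly aged out, here is a minimal sketch that assumes opinion follows a simple random walk. The reply above only says "time series models," so the random walk, the function name `election_day_sd`, and the numbers are all assumptions for illustration.

```python
import math


def election_day_sd(current_sd, daily_drift_sd, days_until_election):
    """Sd of the election-day forecast if opinion follows a random walk.

    Independent daily steps add in variance, so the extra uncertainty
    grows with the square root of the time remaining.
    """
    drift_var = daily_drift_sd ** 2 * days_until_election
    return math.sqrt(current_sd ** 2 + drift_var)


# Hypothetical values: current estimate known to 1 point, 0.3 points of drift per day.
for days in (150, 60, 7, 0):
    print(days, round(election_day_sd(0.01, 0.003, days), 4))
```

Under this assumption the same poll says less about election day the earlier it is taken, because more drift can accumulate in between.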
(quick warning: trust me, I’m an engineer. So really, I’m not an expert on any of the points I’m raising but instead looking at this from a very large distance)
Looking at the first graph you posted, it appears that the popular vote is adapting rather slowly to a change in polls. Will this change as election day comes closer? Otherwise, I’d expect the model to very much miss changes in the last week before the election.
So I’d very much favor a more aggressive/volatile model, at least as a second option to choose from (or even an ensemble of models voting on possible outcomes).
That said, I did play around a little bit with different approaches for poll averaging (Kalman filters and exponentially increasing weights) when looking at data for the 2017 German election. In the end, none of this really improved results compared to a simple moving average, similar to what the graphs in this Wikipedia article are doing:
https://en.wikipedia.org/wiki/Opinion_polling_for_the_next_German_federal_election
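For readers who want to try the comparison mentioned in this comment, here is a minimal sketch of two of the averaging schemes: a simple moving average and an exponentially weighted average. The poll numbers are made up, and the function names are hypothetical rather than taken from any polling library.

```python
def moving_average(values, window):
    """Mean of the most recent `window` values (or all of them, if fewer)."""
    recent = values[-window:]
    return sum(recent) / len(recent)


def exponentially_weighted_average(values, decay):
    """Weighted mean where the newest value has weight 1 and each older
    value is down-weighted by a further factor of `decay` (0 < decay <= 1)."""
    n = len(values)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


polls = [40.0, 41.0, 39.5, 42.0, 44.0]  # made-up shares, oldest first
print(moving_average(polls, 3))
print(exponentially_weighted_average(polls, 0.5))
```

A smaller `decay` makes the average react faster to the latest poll, at the cost of more noise.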
Daniel:
In our approach, we don’t think of the forecast in terms of how we average the data. Instead, we set up a generative model for the data and use Bayesian inference to learn about the parameters. Typically, inferences won’t be affected by one single poll. Any given poll can be off. There’s non-sampling error. But if there are a bunch of polls indicating a change in opinion, the inferences will reflect this. Our model and code are open source. You can play around with simulated data and run our model and see what you get.
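A toy version of this idea, far simpler than the Stan model in the repository, is a local-level model fit with a Kalman filter. The generative model, the function name `filter_polls`, and all numbers below are assumptions chosen for illustration. It shows one outlying poll moving the inference only a little, while a run of similar polls moves it a lot.

```python
def filter_polls(polls, poll_sd, drift_sd, prior_mean, prior_sd):
    """Kalman filter for a local-level model.

    Generative model (an assumption for illustration):
        opinion[t] = opinion[t-1] + Normal(0, drift_sd)
        poll[t]    = opinion[t]   + Normal(0, poll_sd)
    Returns the posterior mean of opinion after each poll.
    """
    mean, var = prior_mean, prior_sd ** 2
    means = []
    for poll in polls:
        var += drift_sd ** 2               # opinion may have drifted
        gain = var / (var + poll_sd ** 2)  # weight given to the new poll
        mean += gain * (poll - mean)
        var *= 1.0 - gain
        means.append(mean)
    return means


steady = [0.52] * 10  # ten hypothetical polls in agreement
one_outlier = filter_polls(steady + [0.56], 0.03, 0.002, 0.52, 0.02)
real_shift = filter_polls(steady + [0.56] * 8, 0.03, 0.002, 0.52, 0.02)
print(f"after one outlying poll: {one_outlier[-1]:.3f}")
print(f"after eight such polls:  {real_shift[-1]:.3f}")
```

Here `poll_sd` stands in for total poll error, including non-sampling error, so it is set larger than pure sampling error would suggest.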
Do you have a similar model for electoral college totals? For better or worse, this total is the one that counts, not the popular vote.
The answer to my own question is yes, as I saw when clicking through to the Economist site.
While a model agnostic to what happened in the middle, using data t days before the election with t as an input feature, can produce perfectly fine outputs, I’m not clear on how it would produce the, say, September projection in your figure. The predictions of the future forecasts? Because the today-forecast is completely marginalized there is basically no variability (the expected future data has already been integrated)? Trying to make sure I understand why the estimate and uncertainty don’t change over the future projection. I notice that the future estimates are missing from the Economist presentation.
Ryan:
I’m guessing that the squiggling of those lines is just coming from simulation variability.
I just thought I’d take this opportunity to promote my own upcoming election model, which is a “combination of state polls and election forecasts,” which I achieve entirely by multiplying Nate Silver’s forecasts by the identity matrix and adding .01 to the probability of the person running behind and subtracting .01 from the probability of the person ahead. As soon as FiveThirtyEight begins posting their model, I’ll be releasing some code you can run in your phone’s calculator app that automatically computes my model for you! :)
*This is for the popular vote only. My model for the electoral college is a binomial model consisting of a series of 50 coin flips.
“Slightly less than the other guy is projecting” is a forecasting technique I use all the time at work, with pretty good results.
How concerned are you about using economic “fundamentals” in the model?
Normally, I would have zero problems with it, but there isn’t much historical precedent for treating 14% unemployment as a “fundamental”.
Also, these numbers will probably look better (but still bad) by the election.
Andrew –
Are you going to be doing something on state polling?
Any thoughts you’d care to share on Sam Wang’s stuff at Princeton Election Consortium?
Or Scott Armstrong’s stuff at Pollyvote?
Joshua:
As noted above, our model includes state polls.
Andrew –
Oops.
Sorry. When I read “combination” I figured they were folded into a national-level prediction – didn’t realize they were broken out b
Off topic, but on topic for the blog generally – you might want to comment on this paper:
Identifying airborne transmission as the dominant route for the spread of COVID-19, Zhang et al, PNAS 2020.
https://www.pnas.org/content/pnas/early/2020/06/10/2009637117.full.pdf
This analysis ignores the number of tests performed and seasonality, and for China it ignores that all the patients and healthcare workers started getting vitamin C in February.
This is getting snarked on on Twitter: https://twitter.com/KateGrabowski/status/1271542363257573377 I have *not* evaluated it carefully, but they point to Figure 3, which really does look pretty appalling if it can be taken at face value … https://www.pnas.org/content/pnas/early/2020/06/10/2009637117/F3.large.jpg?width=800&height=600&carousel=1
Yes, that’s amusing, the downward trend regression line post-implementation of face covering can easily be projected back at least ten days before face covering was implemented.
Andrew’s favorite! Regression discontinuity!
oops, only five days before…face coverings are less retroactive than previously thought! :)
You just spoiled my day by linking to this. I was going to print it out just so I could tear it up but that might boost their metrics. Face masks were never used by any significant proportion of the Australian or New Zealand population. NZ has achieved eradication and Australia almost has.
Still never met anyone who answered a poll.
Andrew –
How do you view the issue of how much it changed over a period as short as 3 months?
Few questions I’d love to find out more about:
— Did you consider using media mentions or anything to capture a leading indicator that polling might miss near the election date?
— How did you calculate the party ID/swing for a given geography? Did you consider using a voter file?
— Wondering if you considered a Gaussian Process. Can you say more about any limitations you see in the design?
Great work!
Do you include a parameter for and an informed prior over a consistent bias across all polls?
Recalling the issue in 2016, do you handle correlated errors in the state polling?
If you click through to the Economist article, they say this: “We take into account that states that are similar are likely to move with each other; if Donald Trump wins Minnesota, he will probably win Wisconsin too.”
Which sounds like “yes” :)
This sounds like a model for correlated voting, but not necessarily for correlated *errors in polling* as in “regions that have more of type x people, who don’t answer polls, will tend to be skewed more towards the answers that type y people give” etc.
Good point. Maybe Andrew will comment on this, but it also looks like code is available.
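The distinction raised in this exchange can be made concrete with a small sketch. It assumes each state's polling error is the sum of a bias shared by all states and an independent state-specific error; this is an assumption for illustration, not a description of how the linked code handles it, and the function names and numbers are hypothetical.

```python
import math
import random


def sd_of_average_error(shared_sd, state_sd, n_states):
    """Sd of the mean polling error across states when every state shares
    one common bias and also has its own independent error."""
    return math.sqrt(shared_sd ** 2 + state_sd ** 2 / n_states)


def simulate_average_error(shared_sd, state_sd, n_states, n_sims, seed=0):
    """Monte Carlo check of the formula above."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_sims):
        shared = rng.gauss(0.0, shared_sd)
        errors = [shared + rng.gauss(0.0, state_sd) for _ in range(n_states)]
        averages.append(sum(errors) / n_states)
    mean = sum(averages) / n_sims
    return math.sqrt(sum((a - mean) ** 2 for a in averages) / (n_sims - 1))


print(sd_of_average_error(0.0, 0.03, 10))   # independent errors average out
print(sd_of_average_error(0.02, 0.03, 10))  # a shared bias does not
```

Adding more states shrinks only the independent part of the error, which is why a forecast that treats state polling errors as independent can look far more certain than it should.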
Andrew, can you summarize key similarities and differences between your approach and 538?
Luke:
I’m not quite sure what 538 does. Can you send us a link to their code?
Burn..
You may not be “quite sure”, but you know something about 538 methodology, which represented a big step forward in popular culture (maybe the academic literature was on par w/ Silver in 2008). I am curious how you would qualitatively compare your models, in case their November predictions differ.
Luke:
I haven’t talked with Nate about his model for years, and I don’t know what he’s doing. I don’t know what the New York Times is doing either, I don’t know what Real Clear Politics is doing, etc. I guess they’re doing some sort of weighted average of the polls. I can’t really say more without knowing what they’re actually doing. If you can send a link to the code that Nate or others are using, then it would be possible to say more. I might not go to the trouble of looking into the code myself, but other blog readers might do so. Here’s a description of our model in words, if that helps.
Asked you this before but didn’t get a response. Have you ever looked at the Princeton Election Consortium? I think you might find Sam Wang’s approach there interesting. Or Scott Armstrong at Pollyvote?
Joshua:
I just looked these up now. The Princeton thing is based on poll averaging; the Pollyvote thing is based on averaging forecasts. Both are set up as operations on data rather than as statistical models. That doesn’t make these methods worse than ours; they’re just using an entirely different approach. I like our approach better because we’re using more information and we are explicitly modeling the time series in each state.
Do you take into account the crises that will be ginned up in the months and weeks before the election?
Rm:
No, except to the extent that this is implicitly included in the incumbency advantage and the error term of the forecast.
Last time, the final run of Kremp’s model gave a 98% probability of Clinton winning the election. How is it different this time?
Guile:
Yes, Kremp’s model had problems. It did not appropriately combine data from state and national polls. There were some other issues too. We took the Kremp code as a starting point, but we pretty much rewrote the model from scratch. The Kremp example was helpful in letting us see what could go wrong.