Prediction markets and election forecasts

Zev Berger writes:

The question sounds snarky, but it’s not meant in that vein. It’s instructive to hear how modelers understand the predictions of their models, which is something I am still trying to think through.

Your model has the chance of Biden being elected at 0.95. Predictit has Biden at 0.60. Given the spread, do you have money on a Biden victory?

My reply: I wrote about this here and in section 2.6 of this article.
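
To make the arithmetic behind the question concrete, here is a minimal sketch of the expected profit from buying one contract that pays $1 if Biden wins, using the 0.95 and 0.60 figures quoted above. The function names and the `fee_on_profit` parameter are hypothetical, introduced only for illustration. The sketch ignores real-world frictions such as position limits, withdrawal fees, and the time value of money, and it treats the model probability as if it were exactly right.

```python
def expected_profit_per_contract(model_prob, market_price, fee_on_profit=0.0):
    """Expected profit from buying one $1-payout contract at market_price.

    model_prob:    probability of the event according to the forecast.
    market_price:  price paid per contract, in dollars (between 0 and 1).
    fee_on_profit: hypothetical fraction of winnings taken by the exchange.
    """
    profit_if_win = (1.0 - market_price) * (1.0 - fee_on_profit)
    loss_if_lose = market_price
    return model_prob * profit_if_win - (1.0 - model_prob) * loss_if_lose


def break_even_probability(market_price, fee_on_profit=0.0):
    """Smallest event probability at which buying has non-negative expected profit."""
    profit_if_win = (1.0 - market_price) * (1.0 - fee_on_profit)
    return market_price / (market_price + profit_if_win)


if __name__ == "__main__":
    ev = expected_profit_per_contract(0.95, 0.60)
    print(f"expected profit per contract: {ev:.3f}")
    print(f"break-even probability: {break_even_probability(0.60):.3f}")
```

With no fees the expected profit is simply the model probability minus the market price, so the size of the spread is the whole story. Whether to act on it depends on how much one trusts the 0.95.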

Relatedly, I received this email from Harry Crane:

Writing to call your attention to joint work with Darrion Vinson, which may be of interest to your readers.

We’re running a study that compares statistical forecasts against prediction markets for the 2020 election cycle. We’re pre-registering our analysis by posting our methods ahead of time. The first version is here.

We also have an app that tracks the performance over time.

Currently we’re only comparing forecasts from 538 to the market at PredictIt.

I understand you’ve also designed a model for the Economist. If you have historical data of daily forecasts for President, House and/or Senate, perhaps we could add your method to our analysis in a later version.

I pointed him to the above links on betting markets for 2020, along with our election forecast (just google economist election forecast) with code, data, and predictions on github. I am only involved in the forecast for president, so I can’t comment on any forecasts for Congress.

P.S. No 6-month lag for obvious reasons.

7 thoughts on “Prediction markets and election forecasts”

  1. Any election model should be able to calculate the probability that a poll taken today with sample size X in state S will show Biden in the lead, right?

    Would this be a good way to make and assess a large number of predictions *before* election day to compare dueling models?

    • Isaac:

      You can do better than that. Instead of predicting whether Biden is in the lead in the poll (that’s at most 1 bit of data), you can make a prediction of his share of two-party support in the poll.

      Such evaluations are fine. The trouble is that they are correlated, as a national swing will shift all polls. With no national swing (as we’ve seen so far during the campaign), our forecasts of intermediate polls will look underconfident: I’m pretty sure that more than half the poll outcomes will be in the predicted 50% intervals, more than 90% within the predicted 90% intervals, etc. But then if there’s a national shift, everything will pop out of the intervals. Even if all is correct in expectation, things won’t average out in any given campaign.

      Also, a model could predict polls carefully but still get destroyed in the election itself, because a prediction of polls will be unaffected by a national error term saying that all polls can be off in one direction or another (and, yes, our model has such an error term).
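
The point about correlated poll forecasts can be illustrated with a small calculation. The sketch below is a toy setup, not the Economist model: it assumes each poll's error is a shared national swing plus independent poll-level noise, both normal, and the standard deviations of 3 and 2 percentage points are made-up values. The function name `interval_coverage` is hypothetical. It computes how often polls fall inside the model's predictive intervals once the national swing has taken a particular value.

```python
from statistics import NormalDist


def interval_coverage(realized_swing, level, sigma_national=3.0, sigma_poll=2.0):
    """Fraction of polls inside the central predictive interval, given the swing.

    The forecaster does not know the swing in advance, so the predictive
    interval uses the total sd, sqrt(sigma_national^2 + sigma_poll^2).
    After the swing is realized, poll errors are N(realized_swing, sigma_poll).
    All quantities are in percentage points; the defaults are assumptions.
    """
    std_normal = NormalDist()
    sigma_total = (sigma_national ** 2 + sigma_poll ** 2) ** 0.5
    half_width = std_normal.inv_cdf(0.5 + level / 2.0) * sigma_total
    upper = (half_width - realized_swing) / sigma_poll
    lower = (-half_width - realized_swing) / sigma_poll
    return std_normal.cdf(upper) - std_normal.cdf(lower)


if __name__ == "__main__":
    for swing in (0.0, 6.0):
        for level in (0.5, 0.9):
            cov = interval_coverage(swing, level)
            print(f"swing={swing:+.0f}  nominal={level:.0%}  actual={cov:.1%}")
```

Under these assumptions, a campaign with no national swing puts well over half of the polls inside the 50% intervals, and a large swing pushes most of them out. Averaged over the assumed distribution of swings the coverage is nominal, but any single campaign sees only one swing.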

  2. Not really related to the blog post, but this is your latest election model article so I’m sticking it here:

    I discovered an interesting negative correlation between states in the Economist model. It doesn’t rely on any particularly unlikely scenario, and seems pretty robust (I’ve checked that it appears consistently in multiple sets of 40000 simulations.)

    Given that Trump wins Texas, Florida, Ohio, and Pennsylvania (~5% chance), how does him winning or losing Nevada affect the odds of him winning North Carolina?

    In this particular case, the odds of Trump winning Nevada are not too far from 50%. But in simulations where Trump wins Nevada (738), he has a 74% chance of winning North Carolina. In simulations where he loses Nevada (1147), he has an 81% chance of winning North Carolina.

    None of these scenarios are that uncommon, so I don’t think it’s a fluke. Is this kind of negative correlation unexpected? Is it similar to the kinds of negative correlation you see in the tails of the FiveThirtyEight forecast?

    If I have the time I’m going to try to write a script to search for weird correlations automatically…

    • Can’t you explain this behavior by the fact that the model incorporates information from both state and national polls? If Biden is up 9 points nationally and he loses Nevada, we can chalk this up to state-level polling error. To explain why he was doing so well nationally (but so poorly in Nevada), the simplest explanation is that there was a polling error in the opposite direction in some other states to make up the difference in the national polls. I think this behavior would account for the negative correlations.

    • “If I have the time I’m going to try to write a script to search for weird correlations automatically…”

      If you can do that, maybe Nate Silver wants to hire you to write a garbage-identifying wizard that finds all the negative correlations and applies a bodge so that they are slightly positive. I’m kind of surprised that the plug-and-play model released by 538 didn’t already have that as damage control.

      I just cannot imagine developing a model with all these moving parts that does NOT blow up on the margins. If the model is tuned to be responsive to small changes in things like fundamentals, interaction effects would have to be damped when there are big differences between states. In other words, I don’t see how a complicated model like this can avoid negative correlations without a garbage identify/repair protocol.
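
For anyone who wants to try the conditional calculation and the automatic search discussed in this thread, here is a minimal sketch. It assumes the simulations are available as a list of dicts mapping a state code to True when the candidate wins that state. That layout, the function names, and the tiny hand-made `toy_draws` list are all hypothetical; none of it is output from the Economist model or a description of its code.

```python
from itertools import permutations


def conditional_win_rate(draws, target, given):
    """Win rate in `target` among draws that match every outcome in `given`.

    draws: list of dicts, state code -> True if the candidate wins the state.
    given: dict, state code -> required outcome (True or False).
    Returns (rate, n_matching); rate is None when no draw matches.
    """
    matching = [d for d in draws if all(d[s] == v for s, v in given.items())]
    if not matching:
        return None, 0
    wins = sum(1 for d in matching if d[target])
    return wins / len(matching), len(matching)


def negative_pairs(draws, states, base=None, min_count=1):
    """Flag (target, cond) pairs where winning `cond` lowers the win rate in `target`.

    base: outcomes to condition on throughout, e.g. {"TX": True}.
    min_count: skip comparisons backed by fewer matching draws than this.
    """
    base = base or {}
    flagged = []
    for target, cond in permutations(states, 2):
        if target in base or cond in base:
            continue
        p_win, n_win = conditional_win_rate(draws, target, {**base, cond: True})
        p_loss, n_loss = conditional_win_rate(draws, target, {**base, cond: False})
        if n_win < min_count or n_loss < min_count:
            continue
        if p_win < p_loss:
            flagged.append((target, cond, p_win, p_loss))
    return flagged


# Hand-made toy draws, for illustration only.
toy_draws = [
    {"TX": True, "NV": True, "NC": True},
    {"TX": True, "NV": True, "NC": False},
    {"TX": True, "NV": False, "NC": True},
    {"TX": True, "NV": False, "NC": True},
    {"TX": False, "NV": True, "NC": True},
]

if __name__ == "__main__":
    for row in negative_pairs(toy_draws, ["NV", "NC"], base={"TX": True}):
        print(row)
```

With real simulations, `min_count` should be set high enough that sampling noise does not dominate. As a rough check on the figures quoted above, 74% of 738 draws and 81% of 1147 draws have binomial standard errors of about 1.6 and 1.2 percentage points, so the gap is roughly 3.5 standard errors if the draws are independent. A search over many pairs will still turn up some gaps by chance.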

  3. Snark is a word that blends snake and shark, invented by mathematician Lewis Carroll, whose interest in little girls would today place him under political suspicion. The meaning of snark is clear enough to see.
    Snarky is an extension of that.
    Snark will soon be available as a verb.
    A snarker is a person who snarks, or comments snarkily.
    Snarkism has the potential to make marginalized and vulnerable people feel unsafe. Activists must advocate Anti-Snarkism with moral clarity. Failing to advocate with sufficient moral clarity (aka “inner light”) places one under suspicion (aka is “concerning,” “troubling,” and “problematic”).
