Post-election post

A favorite demonstration in statistics classes is to show a coin and ask what is the probability it comes up heads when flipped. Students will correctly reply 1/2. You then flip the coin high into the air, catch it, slap it on your wrist, look at it, and cover it up again with your hand. Now what is the probability of heads? It’s 1 or 0 to you, but it’s still 1/2 to the students.

This is a post-election post in the sense that it takes place after the voting is done, but it’s a mid-election post in the sense that not all the votes have been counted.

I’ll do some probability calculations based on our forecast, but first let’s go over the big things.

Our pre-election forecast gave Biden a win probability beginning in the 80-90% range and concluding at 97%. Right now it looks like Biden is favored to eke it out in the electoral college, but in any case we can project the outcome near the lower end of our predicted 95% interval, which was [259, 415]. Such things happen, but you don’t go into a forecast anticipating such an unlikely event, so it pushes us to question our assumptions. A couple days ago I gave three good reasons not to believe our numbers, and we should return to that.

Our final pre-election forecast gave Biden a 99%+ probability of winning the most votes, with a 95% predictive interval of [51.6%, 57.1%]. It’s safe to say that Biden didn’t win anything close to 57.1% of the popular vote. 51.6% is closer to the mark. We can’t be sure at this point, but we can guess that his popular vote share was, again, at the low end of our 95% forecast interval.

Our forecast was based on state polls, national polls, and fundamentals, all of which favored Biden. None of these pieces of information is perfect; indeed, the reason our forecast interval of [259, 415] was so wide was that it allowed for the possibility of systematic errors in all three of these sources.

I don’t think we need to talk too much about the fundamentals right now, except to say that with political polarization we could argue that the fundamentals-based prediction should’ve been more closely anchored to the results of recent national elections (Democrats with 51% to 52% of the national two-party vote), giving less weight to economic factors.

How were the national and state polls off by so much? I can’t say much right now given the information available to me. My first guess would be a combination of differential nonresponse (Trump supporters being less likely to respond to polls and Biden supporters being more likely) and differential turnout (Trump supporters being more likely to go out and vote and Biden supporters being less likely). Other possible factors include differential rates of ballot rejection and last-minute changes in opinion among undecided voters. I’m not sure what to think about the differential nonresponse explanation, as some of the polls did adjust for party identification as well as demographics, but there still can be bias even after making these adjustments.
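To see why adjusting for party ID doesn’t automatically remove the bias, here’s a toy calculation in R. All the numbers are invented for illustration, not estimates from any poll: even if Republicans are weighted back to their correct share of the electorate, the poll misreads them whenever Trump-backing Republicans respond at a lower rate than other Republicans.

```r
# Toy numbers: suppose 90% of Republicans back Trump, but Trump-backing
# Republicans respond to polls at half the rate of other Republicans.
trump_among_rep <- 0.90
resp_trump <- 0.5   # relative response rate, Trump-backing Republicans
resp_other <- 1.0   # relative response rate, other Republicans

# Share of Republican *respondents* who back Trump:
trump_among_rep_resp <- (trump_among_rep * resp_trump) /
  (trump_among_rep * resp_trump + (1 - trump_among_rep) * resp_other)
round(trump_among_rep_resp, 2)   # 0.82, not 0.90

# With Republicans weighted to, say, 40% of the electorate, the topline
# still understates Trump's share by about 0.40 * (0.90 - 0.82):
round(0.40 * (trump_among_rep - trump_among_rep_resp), 3)   # roughly 3 points
```

The weighting fixes the party mix but not the within-party selection, which is the kind of residual bias I have in mind.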

If we could go back and fix our model of the polls, what would we do? For the reasons we and others have discussed, that 96% forecast looked overconfident even at the time. We’ve written about incentives for over- and under-confidence in forecasts, but in this case what was driving the high probability was not so much any social incentives as the logic of the statistical model. But if the result is this extreme, that suggests a problem with the model—somewhere.

Speaking in general terms, I can see four ways the model could’ve been modified ahead of time to produce forecast probabilities closer to 50%:

1. Wider uncertainties. Our model already accounted for the possibility of large polling errors at the national and state levels. We could’ve made these intervals even wider, on the theory that future errors can be more extreme than the past. This is kind of a boring answer, but increasing the uncertainties is a start, and our model had so many moving parts that we perhaps didn’t fully understand how uncertainties in survey response and voter turnout would interact.

2. A systematic shift toward the Republican party, based on the theory that the polls were off in 2016 and this election will be similar. We were assuming the problems from 2016 would’ve been addressed, but maybe that was a mistake.

3. Incorporation of additional information such as new-voter registration numbers that favored the Republicans in several key swing states. Any model is conditional only on what it includes, and in retrospect, if turnout was an issue, then voter registration data could be relevant, even if in some past elections such numbers were not so predictive.

4. A directional shift in the model, not toward the Republicans but toward 50/50 in the electoral college, or maybe 51/49 in the popular vote, based on the idea that (a) with political polarization, votes are stable, and (b) it’s appropriate to be uncertain about the predictions of interest. This takes us back to the fundamentals model, but we could also think of this “shift toward 50/50” as an external factor applied after all the modeling is done: If the forecast says 95%, we should believe 80%; if the forecast says 80%, we should believe 60%; etc. I’m not so sure how to think about this. For example, suppose we had been applying this rule in 1984, the day before Ronald Reagan was reelected. Would we really have wanted to give that only an 80% probability? Maybe.
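For concreteness, one way to implement that external “shift toward 50/50” is to shrink the forecast probability toward 1/2 on the log-odds scale. This is only a sketch; the shrinkage factor below is a number I picked to match the 95%-to-80% example, not anything estimated from our model:

```r
# Shrink a forecast win probability toward 50% on the log-odds scale.
# k = 0.47 is chosen so that a stated 95% maps to about 80%; it is an
# illustrative choice, not a fitted value.
recalibrate <- function(p, k = 0.47) {
  plogis(k * qlogis(p))   # qlogis = log-odds, plogis = inverse logit
}

round(recalibrate(c(0.95, 0.80, 0.50)), 2)   # 0.80 0.66 0.50
```

Note that a single shrinkage factor can’t hit both anchors exactly: the k that sends 95% to 80% sends 80% to about 66%, not 60%. Matching both would require even more aggressive shrinkage of confident forecasts, which is one more reason it’s not obvious how to think about this kind of after-the-fact correction.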

There’s also the option of saying that our forecast was fine and we just had bad luck. I don’t buy that, though. What happened was that we were whacked by a big-ass error term, but “error terms” come from somewhere. It wasn’t some god rolling dice. The real question is how we could have better anticipated this. Lots to chew on. I will say that there is a perverse benefit to making a strong prediction that didn’t turn out well (again, you don’t want to be at the edge of that 95% interval if you can avoid it), and that’s that it pushes us to really think hard about what went wrong with the model.

Conditional probabilities

As I’m sure you’ve heard, poll-based forecasts, including ours, were off in several key states. Given the information that’s readily available to me, it’s hard to be sure exactly how large the errors are. For example, our point forecast for Biden’s share of the two-party vote in Florida was 51.7% with a 95% interval of [47.8%, 55.5%]. The general picture is here, but I’m actually working off the simulations that are linked near the bottom of that page:

> sims_economist <- read.csv("electoral_college_simulations.csv")
> state_sims <- sims_economist  # assuming the state vote-share columns ("FL", etc.) are in this file
> florida <- state_sims[,"FL"]
> mean(florida)
[1] 0.5166
> quantile(florida, c(0.025, 0.975))
  2.5%  97.5% 
0.4779 0.5552

But in the newspaper Biden’s Florida vote share is currently reported as 48.2%, so if that particular number holds up we’re already talking about an unexpected event. Not outside the 95% predictive interval but certainly a surprise.

We can do some analyses conditioning on Biden only getting 48.2% of the Florida vote. (We could simply condition on him losing Florida, but that’s not quite right, given that we have a sense, as here, of how much he lost by.) This can be done directly using the simulations. For computational stability, we condition on simulations that fall within a narrow band of the reported outcome:

subset <- abs(state_sims[,"FL"] - 0.482) < 0.005

We can then compute Biden's expected national vote share conditional on this result, along with the probability of him winning the electoral college:

> popvote <- sims_economist[,"natl_pop_vote"]
> evote <- sims_economist[,"dem_ev"]
> mean(popvote[subset])
[1] 0.5248
> mean(evote[subset] >= 269)
[1] 0.7787

He only needs to win 269 because it seems that he won that contested electoral vote in Nebraska. So an estimated 52.5% share of the popular vote (down from the 54.4% in our forecast!) and an estimated 78% chance of winning the electoral college (down from our notorious 96%, which seems to have crept up to 97% on election day).

OK, now let's do all the states conditioning on Biden getting 48.2% in Florida:

> biden_wins <- state_sims > 0.5
> conditional_biden_win_prob <- apply(biden_wins[subset,], 2, mean)
> round(conditional_biden_win_prob, 2)
  AK   AL   AR   AZ   CA   CO   CT   DC   DE   FL   GA   HI   IA   ID   IL   IN   KS   KY   LA   MA   MD   ME   MI 
0.01 0.00 0.00 0.27 1.00 0.99 1.00 1.00 1.00 0.00 0.30 1.00 0.11 0.00 1.00 0.00 0.00 0.00 0.00 1.00 1.00 1.00 0.90 
  MN   MO   MS   MT   NC   ND   NE   NH   NJ   NM   NV   NY   OH   OK   OR   PA   RI   SC   SD   TN   TX   UT   VA 
0.94 0.00 0.00 0.00 0.33 0.00 0.00 0.92 1.00 0.99 0.76 1.00 0.06 0.00 1.00 0.70 1.00 0.00 0.00 0.00 0.11 0.00 0.98 
  VT   WA   WI   WV   WY 
1.00 1.00 0.89 0.00 0.00 

Let's just look at the states where either candidate has a reasonable chance:

> close <- conditional_biden_win_prob > 0.05 & conditional_biden_win_prob < 0.95
> round(conditional_biden_win_prob[close], 2)
  AZ   GA   IA   MI   MN   NC   NH   NV   OH   PA   TX   WI 
0.27 0.30 0.11 0.90 0.94 0.33 0.92 0.76 0.06 0.70 0.11 0.89 

We know more than this, of course, but not a lot more. Of the above states, we know that Trump won Iowa, Ohio, and Texas; Biden won Minnesota and New Hampshire, and the others are too close to call. We can condition on those outcomes:

subset <- abs(florida - 0.482) < 0.005 & (state_sims[,"IA"] < 0.49) & (state_sims[,"OH"] < 0.49) & (state_sims[,"TX"] < 0.49) & (state_sims[,"MN"] > 0.51) & (state_sims[,"NH"] > 0.51)

They're still waiting for mail ballots so I don't feel comfortable conditioning on the reported totals, but I'm conditioning on the winners' share being more than 49% in each state, on the theory that if it could be closer than that, the news organizations wouldn't have felt confident enough to make the call.

What happens conditional on the results now?

> round(mean(popvote[subset]), 3)
[1] 0.522
> round(mean(evote[subset] >= 269), 2)
[1] 0.79

Biden's then predicted to win 52.2% of the national vote and, according to the model, has a 79% chance of winning the electoral college.

And here are the close states again:

> round(conditional_biden_win_prob[close], 2)
  AZ   GA   IA   MI   MN   NC   NH   NV   OH   PA   TX   WI 
0.17 0.22 0.00 0.96 1.00 0.22 1.00 0.68 0.00 0.74 0.00 0.96 

OK, that's all based on a model that, as we've discussed, had problems. I'm not intending the above numbers to be forward-looking predictions (Biden with a 96% chance of winning Wisconsin, etc.), just working thru what the model is saying.

47 Comments

  1. David says:

    I know it is not the most scientific take, but it sure as hell feels like there is some sort of “Trump” effect when it comes to the polls. As best I understand, we didn’t see these huge, correlated errors in 2018 with other Republicans (sure there were misses but not so big, and not always favoring one side), and we don’t see this with populist candidates in Europe. It somehow seems as if Trump’s repeated, 5-year assault on the media and press has led to massive differential non-response or even outright dishonesty when interacting with pollsters. (Interestingly, Ann Selzer’s last Iowa poll seemed to nail it, and pointed to people “coming home” to Trump. If so, this would go against the idea that people won’t say; they just somehow wait till the last minute and it takes the right expertise to model/extract it.) Either way, if this is true, I’m not sure models like yours can be useful with candidates like Trump. Simply adding more uncertainty does not fix bias, nor does pushing things towards 50-50 tell us anything useful. Your model relies on unbiased poll error; unless you can model and predict the bias, it seems not much better than simply guessing.

  2. Boaz Barak says:

    At some point I looked at both your and the 538 model https://windowsontheory.org/2020/10/30/digging-into-election-models/
    As different as they are, they seemed to give similar values to the probability of Biden winning conditioned on any particular national popular vote margin. 538 just had fatter tails on the national popular margin.

    I am not sure that adding those tails (e.g., Trump winning the popular vote by 2%, or Biden winning by 17%) was justified on its own, but adding this much national uncertainty is one way to account for “unknown unknowns” at the state level, and in particular for correlated polling errors that may not be national but are correlated across the states that matter, in a way we didn’t see before.

  3. Doug says:

    Small goof in your code, though based on the results below it looks like you fixed it and then copied/pasted the old code:

    > close <- conditional_biden_win_prob > 0.05 & conditional_biden_win_prob < 0.05 (that should be 0.95, right?)

  4. Carlos Ungil says:

    Some news outlets have already called Arizona for Biden [1], which is also a bit surprising according to the model (17% probability in your last analysis).

    I think conditioning also on that will increase Biden’s win probability from 79% to more than 90%.

    [1] https://www.nytimes.com/interactive/2020/11/03/upshot/network-race-call-tracker.html

  5. Anon says:

    Look at how many votes seem to not even be counted because of USPS issues:

    https://twitter.com/johnkruzel/status/1324004554485211136/photo/1

    As you’ve written recently, maybe the issue was you are evaluating your model based on counted votes but you modeled something else entirely. If you can’t model something (like vote suppression/USPS incompetence), I think you need to rethink your whole approach given how the public consumes your forecast.

  6. Damian says:

    Are you aware of the specific troll strategy employed by (some number of) Trump voters? They describe it in discussions on their subreddits and such: they explicitly lie to pollsters, claiming to be Republicans or Independents who are going to vote for Biden. Their explicit strategy is to deflect poll predictions and create opportunities to force misallocation of Democratic resources or instill overconfidence (it pairs with an internalized strategy of GOTV on election day).

    This is something they brag about doing, amongst themselves. It’s not a “shy” voter, it’s a Poll Troll–not shame or indecisiveness but a conscious tactic.

    I’ve been thinking about this for a few weeks, ever since I became aware that this was a strategy proudly embraced in their online circles. I don’t know how widespread it is among the general electorate. But if even 3 percent of Trump voters employ this strategy consistently when they are polled, it would account for a significant systematic polling error, wouldn’t it?

    I don’t know if that’s what happening this week (or, more accurately, if it’s a significant part of the story)–generally everyone I’ve brought this up with has told me it couldn’t possibly be a widespread phenomenon. But Ron Paul voters learned how to break state straw polls and non-scientific “who won” network text polls, and it took a little while for the system to learn how to screen them out. I’m not sure this is much different. If you know what to tell a pollster to throw chaff into the models, and if you have a small but committed faction that do this consistently, I don’t see any reason it wouldn’t work.

    I don’t know what you’d do about that–you’d have to counterprogram the polls to account for an uncertain number of sabotaging responses. But, if I’m right about this at all, there’s no reason this had to come as a complete surprise to the polling/modeling community–it wasn’t a particularly well-kept secret, if you were willing to look at the discussions in their own communities.

    Submitted humbly–it’s quite possible this phenomenon was on-line only and constituted only a vanishingly small percentage of Trump voters. I just don’t know. But SOMETHING was wrong with the polling, correct?

    • Andrew says:

      Damian:

      I’m skeptical, in part because in many cases you’d think it would hurt your candidate for him to perform badly in the polls: at some point, if Trump is performing badly enough, his political allies will take notice, right? My guess is this kind of thing is more talk than action, and that if they were really trying to manipulate the polls, they’d do it the other way by trying to make their preferred candidate look stronger. But who knows, I guess what you say is possible.

      I agree that something was wrong with the polling, but right now I’d still be more likely to attribute the “something wrong” to differential nonresponse, errors in estimating turnout, and votes that haven’t been counted.

      • Damian says:

        I fully agree–I’m skeptical too. Three points:
        1. Bad polls don’t hurt a candidate–at all–if the vast majority of his supporters have fully embraced an epistemology that says all the polls are biased and wrong. In fact, seeing polls that are wildly out of step with what they personally experience in their communities will SUPPORT this epistemology and strengthen the candidate. I feel like it’s hard for non-Trump voters to internalize how completely, how intensely, Trump voters have rejected any “mainstream” narratives. I really don’t think bad polls hurt him at all–the results speak for themselves. So while this would probably be a disastrous strategy for a centrist Democrat, it makes a lot more sense for Trump’s supporters to do this.
        2. Even if it IS a bad strategy, that doesn’t mean they wouldn’t do it. Lots of political factions do weird things. Again, I’m only taking them at THEIR word that they’re doing this. If they’re lying about lying to pollsters, well, why lie about lying? Anyway, you can’t lie to make your candidate look stronger–dishonesty is asymmetric; it can ONLY work to make your candidate look weaker. So I don’t know what you mean by “do it the other way.” How could a Trump supporter fabricate an answer that makes Trump look stronger?
        3. I agree, your solutions are probably more likely. But I wouldn’t toss out fringe ideas that might be (small? medium sized?) contributors to what looks like some pretty large systematic errors.

  7. Marc says:

    Andrew,

    Thanks again for the transparency. Seeing your process in real time is a great case study in the modelling process.

    It looks like the results will be right at the edge of the predicted confidence intervals. It may very well be just inside for 538 and just outside for your model, but the two had more similarity than difference.

    My main thought today is epistemological humility. We (the public who look carefully at election prediction) know less about American voters, and less about how to predict their votes, than we thought we did.

    For future modelling, I think the biggest implication is a need to widen our confidence intervals. Not only in the tails, but throughout the distribution. Second (or maybe just another aspect of widening confidence intervals), we should expect more correlation in polling error. Residuals do not seem very random (or maybe I am suffering from availability bias). Third, I would look more at the usefulness of the fundamentals model, and be prepared to scrap it. Given the increase in polarization, and the fact that a once-in-a-century public health crisis and a once-in-a-century economic drop had little effect on voting intention, I don’t know that historic fundamental data is useful in predictions post 2016. Finally, although it will be very difficult, modeling the link from voting intention to actual votes (i.e., turnout and vote counting) is becoming more important than ever.

    • anonymous says:

      “Given the increase in polarization, and the fact that a once-in-a-century public health crisis and a once-in-a-century economic drop had little effect on voting intention, I don’t know that historic fundamental data is useful in predictions post 2016.”

      +1 100% – Trump’s electorate already think they should be better-off than they were pre-covid, and blame covid’s further economic fallout on the deep state, anti-Trump officials, etc.

      Correct for this — along with the “something wrong” items in Andrew’s response to Damian above — and you will have a significantly more accurate forecast, IMO.

    • Martha (Smith) says:

      Marc said,
      “Thanks again for the transparency. Seeing your process in real time is a great case study in the modelling process” and

      “My main thought today is epistemological humility. We (the public who look carefully at election prediction) know less about American voters, and less about how to predict their votes, than we thought we did.”

      +1 to both points.

  8. Dave says:

    I’m curious about conditioning on the percentage of the popular vote that a candidate is likely to get (curious enough to go to the source maybe if I can find some time). Seems to me that getting more than 52% is actually very unlikely.

  9. Andrew E says:

    I think a lot of this points to Taleb’s assertion that forecasts don’t correctly price in information entropy. That gets you item 1, about needing wider uncertainties. Items 2-4 strike me as narratives which may be correct, but which would have been difficult or impossible to predict; you’d have needed to price them in as uncertainty beforehand.

  10. Dogen says:

    As I recall you’ve explicitly said, several times, that your model makes no attempt to account for voter suppression effects.

    These efforts have, at many places and times, been 100% effective. For example, Georgia’s current governor. So I find it mind-boggling that you and most commenters here continue to avoid the issue.

    High level republicans openly admit to using voter suppression tactics. They have for decades. If it weren’t effective why would they bother?

  11. Jonathan (another one) says:

    “But in the newspaper Biden’s Florida vote share is currently reported as 48.2%, so if that particular number holds up we’re already talking about an unexpected event. Not outside the 95% predictive interval but certainly a surprise.”

    Sure, but shouldn’t you get a surprise or two when you forecast 51 states or state-like entities? Isn’t the right test here how many shares are outside your 95% predictive intervals? You say that the problem was you were “off in several key states.” Granting that “key” states are more intensively sampled and thus ought to have higher accuracy (although the multiplicity of polling will also increase the standard error of observed polls), aren’t you worried about something that isn’t a part of your model, namely “keyness”? If you predict outside the 95% confidence interval for CA, nobody is going to care (as long as your forecast isn’t so high that you miss CA flipping). An after-the-fact focus on a state with a prediction barely within the forecast interval because it is a “surprise” is really a surprise you ought to have expected, i.e., no surprise at all.

    • Martha (Smith) says:

      Jonathan (a o) said,
      “Sure, but shouldn’t you get a surprise or two when you forecast 51 states or state-like entities? Isn’t the right test here how many shares are outside your 95% predictive intervals? “

      Makes sense to me.

  12. Nick Nolan says:

    Have you considered adjusting for difficulty of voting? Like number of polling locations per capita, drop boxes, length of time it takes to cast a vote.

    I mean, if it takes 4 hours to vote in some county and 20 minutes in another, opinion polls must transfer into votes differently. Florida isn’t a red state. It’s a “let’s make sure it’s very hard for black people to vote,” state.

    • confused says:

      >>Florida isn’t a red state.

      Given the shift in Miami-Dade, I am not sure this is true anymore. I think assumptions about demographic ‘inevitability’ of voting support / high-turnout being necessarily favorable to D may have to be re-evaluated after this election.

    • Martha (Smith) says:

      Nick said:
      “Have you considered adjusting for difficulty of voting? Like number of polling locations per capita, drop boxes, length of time it takes to cast a vote.

      I mean, if it takes 4 hours to vote in some county and 20 minutes in another, opinion polls must transfer into votes differently. “

      Makes sense to me.

  13. Marc says:

    Hot take.

    This wasn’t all that bad for the modellers (Economist and 538)

    As absentee count data comes in, it looks like Biden will have 270-290 electoral votes and a >5% margin in the popular vote. That’s just at the bounds of the confidence interval from the Economist model, and within the confidence interval for 538.

    In other words, taking the models as our priors, this was a .05 ~ .15 likelihood event. Strong enough to move our future expectations, but not strong enough to conclude that we were “wrong”

  14. Jessica says:

    Given all the discussion happening now online over whether people should pay any attention to forecasts, I wonder whether we could avoid the post election ‘forecasts on trial’ that’s happened the last few elections by changing the communication to focus on relative changes over the last election rather than focusing on final outcomes like projected vote share, electoral college votes, probability of win etc. E.g., relative to 2016, Biden is expected to better survive polling errors than Clinton, turnout is expected to be about X% more, etc. It’s a small change and doesn’t solve any problems with bias in the input data, but I could see many people being less likely to quit trusting any polling or forecasting if the message was about the change, since then the potential for model error is sort of built in to the communication. It reminds me of climate science, where journalists and the public have a hard time not overthinking specific predictions (like how much sea level will rise), whereas climate scientists are much more interested in what they learn about relative differences over time.

    • Anon says:

      I think the issue is still the insistence on a probability that gives the impression that “one outcome is overwhelmingly possible” and a point estimate. If Nate or Andrew or whomever wants to say: “this was within the margin of error” then the margin of error needs to be really really clear and the point estimate needs to be really really not as important. The point estimate is still the overwhelming focus and so when the margin of error still contains the outcome, no one really cares or appreciates that it was there.

      I think the visuals this time around were better but they need to go one or two steps further. Your suggestion is one way that could work for sure.

      • This is actually a topic I do research on (how to design representations of uncertainty so people won’t ignore them). To be honest, I’m skeptical that there’s much more we can do with the forecast displays, at least when it comes to better displaying quantified uncertainty. As an example: I recently did some work where we looked at people’s decisions and judgments of effect size in comparing two distributions, and specifically wanted to see how much they change when you add a mark showing the mean to different displays like error bars, density plots, and animated draws from each distribution, shown one at a time (https://arxiv.org/pdf/2007.14516.pdf). One would think that at least in some cases, like animated draws, adding static marks to show the means would have made a big difference in how people answered the question, since when they only see one draw at a time, they can’t rely on heuristics like how far apart the means are. But, actually, in none of the cases did means make more than a practically negligible difference. People are very good at suppressing uncertainty even when you think you’ve made it nearly impossible.

        This year, I think FiveThirtyEight made a good choice with Fivey Fox. Though the messages they had him provide are even harder to evaluate than the forecasts, having someone whispering in your ear that even the really low-probability outcomes could still occur is a decent way to get across the unquantified uncertainty. But my thought is also that we need to go even further, like making it hard to get the absolute predictions at all.

        • Anon says:

          Below is a bit hand-wavey but is my thinking on why there may be more to try.

          I wonder if a more dramatic experiment design for research is needed. I agree with your analysis but I’m suggesting something stronger. I like your study but I’m not sure it transfers to the way people consume forecasts.

          Here’s an example. Instead of showing the point estimate at the top (and promoting it aggressively on twitter, while sidestepping the margin of error), I’m suggesting the journalist and the presentation align around persuading the audience to look at an interval and at the edges of the interval as real possibilities.

          I like Foxey but there’s still BOLD FONT with the POINT ESTIMATE in the center. This isn’t quite captured in your study so far as I can tell. The audience is still being guided to falling back into its old habits and to the habits of 2016. On top of that, on Twitter, Nate put more emphasis on his point estimate than on the tails, despite there being more net density in the tails than around a narrow center. Now he did make an attempt to discuss the edge cases but I’m suggesting putting even more emphasis on it and not using the bolding/point estimate as the display center.

          So my issue with the paper is there isn’t an agent (the journalist) or social media amplification (and groupthink) in your analysis. Perhaps it’s naive, but I’m suggesting that aligning the journalist and the display around promoting thinking at the margins would move the needle, so to speak :).

          • Yeah, totally agree that the study is divorced from real-world communication contexts. I think you’re right that there’s some real influence of what a forecaster/journalist seems to be highlighting as important (van der Bles, Freedman, and Spiegelhalter did a study somewhat related to this, where they found that verbal description of something as uncertain in a mock news article was more influential on people’s beliefs about the phenomenon than numeric estimates were). Maybe this could help these models become closer to tools for reasoning about election dynamics than oracles. I wonder if they would lose some of their appeal, though, by no longer playing to the average reader who glances at these things repeatedly but generally in very short increments, and doesn’t really want to learn so much as get answers.

    • Martha (Smith) says:

      Jessica said,
      “Given all the discussion happening now online over whether people should pay any attention to forecasts, I wonder whether we could avoid the post election ‘forecasts on trial’ that’s happened the last few elections by changing the communication to focus on relative changes over the last election rather than focusing on final outcomes like projected vote share, electoral college votes, probability of win etc.”

      Makes sense to me — but I suspect that many people enjoy the “forecasts on trial” focus, treating elections as something like a sport.

  15. Dale Lehman says:

    Regarding polling (in)accuracy: I know that many polls adjust for party of respondent, as they should. But I am also struck by the incredible urban/rural division in the vote – I believe it is more than I’ve ever seen before. So, I was wondering if the polls adjust for location – urban vs rural. To some extent this will be picked up by party affiliation, but that may not be adequate. Suppose that rural voters are under-represented in the polls, but that it is closer to a 50-50 party split in the non-respondents in rural areas compared with the urban areas. I am purely speculating, but two things seem to stand out to me from the results – the first is that the polls appear to have underestimated Trump’s support, and the second is that virtually every state has lopsided rural/urban votes while many have very close statewide votes. In fact, I find the intrastate divisions more startling than the interstate divisions, and I wonder if the polls have addressed this.

    • fogpine says:

      +1, with the proviso that it would also be important to ensure that polls are neither over- nor under-representative of suburban voters, who account for much of the battleground vote and lean Democrat in some areas but Republican in others.

      In a large Pew study (Mercer, Lau, and Kennedy. “For Weighting Online Opt-In Samples, What Matters Most?” 2018.), only 1 of 3 poll vendors was able to meet quotas based on population density, and it looks like adjustments to the poll findings did not include population density. I don’t know much beyond that, though.

    • confused says:

      I also wonder if the rural/urban divide may have actually deepened since the last presidential election, so that it is larger than would have been expected based on prior knowledge?

    • Martha (Smith) says:

      Dale said,
      ” I know that many polls adjust for party of respondent, as they should. But I am also struck by the incredible urban/rural division in the vote – I believe it is more than I’ve ever seen before. So, I was wondering if the polls adjust for location – urban vs rural. To some extent this will be picked up by party affiliation, but that may not be adequate. “

      Good point. But fogpine’s addition that suburban voters also need to be taken into account is also important — it’s not so much a dichotomy as a trichotomy. (Still, socioeconomic status and race/ethnicity/religion are also relevant. So it really is complex.)
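The adjustment the thread above is discussing can be sketched as a simple poststratification: reweight each geographic group’s support to match its known population share rather than its share of poll respondents. All numbers below are invented for illustration; they are not from any actual poll.

```python
# Hypothetical sketch of geographic poststratification.
# Each group: (share of poll respondents, candidate A support in that group).
poll = {
    "urban":    (0.45, 0.62),
    "suburban": (0.40, 0.50),
    "rural":    (0.15, 0.35),
}
# Assumed true population shares (invented numbers).
population = {"urban": 0.31, "suburban": 0.41, "rural": 0.28}

# Raw estimate: weight each group by its share of the sample.
raw = sum(share * support for share, support in poll.values())

# Poststratified estimate: weight by population share instead.
weighted = sum(population[g] * support for g, (share, support) in poll.items())

print(f"raw poll estimate:    {raw:.4f}")
print(f"geography-weighted:   {weighted:.4f}")
```

With rural voters underrepresented in this made-up sample (15% of respondents vs. 28% of the population), the raw estimate overstates candidate A by about four points, which is the kind of gap Dale is speculating about.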

  16. Bill says:

    “A favorite demonstration in statistics classes is to show a coin and ask what is the probability it comes up heads when flipped. Students will correctly reply 1/2. You then flip the coin high into the air, catch it, slap it on your wrist, look at it, and cover it up again with your hand. Now what is the probability of heads? It’s 1 or 0 to you, but it’s still 1/2 to the students.”

    I’ve always done this with more stages. First, I ask the probability that it will come up heads. Then I flip it onto the floor and, before seeing how it came up, cover it with my foot and ask the same question. Then I look (but no one else can see) and ask again. This is usually the point where some say 50% and others say it’s 1 or 0 but they don’t know which. I then ask whether it would be a fair bet for two students to take opposite sides of the bet at this point. Yes, it would, since neither knows how it came up, so (from a Bayesian POV) it’s still 50% from their point of view even though it is either 0 or 1 for me. I then say (always truthfully, but they don’t know how truthful I am or whether I’m trying to trick them) which way it came up, and ask again what the probability is. Now they have to guess how reliable my statement is. Typically they will say something like 80% or 90%, but not 50% and not 100%. I then invite a student to look at the coin and say what they saw. Almost always, the student will say what I said, and then the probabilities that the class comes up with will go up accordingly, although not necessarily to 1. Finally, I invite any student who wishes to look at it.

    There was one instance where a student looked at it and contradicted what I said. On the second day of class! I don’t recall how the discussion finished that day, but I was immediately impressed by that student and mentally marked him as potentially remarkable. As it turned out, he was the best undergrad student I ever had in my almost 50 years of teaching. He turned down a Rhodes scholarship so that he could accept a Marshall and go to Cambridge, and after his stint at Cambridge (where he met his wife) he returned to the US and got his PhD at one of the best statistics programs in the country (where I had arranged for him to get support to work on projects there during several summers when he was an undergraduate).
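Bill’s sequential demonstration has a tidy Bayesian form: each report of “heads” from a witness trusted with some reliability multiplies the likelihood of each state of the coin. The reliabilities below (80% for the instructor, 90% for the confirming student) are assumed figures matching the kind of answers the class gives, not anything from the comment itself.

```python
def posterior_heads(prior, reliabilities):
    """P(heads) after independent witnesses all report 'heads',
    each truthful with the given reliability."""
    p_heads, p_tails = prior, 1 - prior
    for r in reliabilities:
        p_heads *= r        # truthful "heads" report given heads
        p_tails *= 1 - r    # lying "heads" report given tails
    return p_heads / (p_heads + p_tails)

# Instructor alone, trusted 80% of the time: posterior is exactly 0.8.
print(posterior_heads(0.5, [0.8]))
# Instructor plus a confirming student trusted 90%:
# 0.5*0.8*0.9 / (0.5*0.8*0.9 + 0.5*0.2*0.1) ≈ 0.973 -- up, but not to 1.
print(posterior_heads(0.5, [0.8, 0.9]))
```

With a 50-50 prior, the posterior after one report equals the witness’s reliability, which is why the class’s 80%-90% answers are exactly the Bayesian ones.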

  17. Danny says:

    How do the predictions fare if you were to calculate error at 3 sigma?
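One way to sketch an answer, using only the popular-vote forecast stated in the post (a 95% interval of [51.6%, 57.1%]): under a normal approximation, the interval’s half-width is 1.96 standard deviations. The realized share of roughly 52% is an assumed illustrative figure, since the post says only that 51.6% was “closer to the mark.”

```python
# Back out the implied sd from the post's 95% popular-vote interval
# and ask how many sigmas an assumed ~52% outcome sits from the mean.
lo, hi = 51.6, 57.1
mean = (lo + hi) / 2           # 54.35
sd = (hi - lo) / (2 * 1.96)    # ≈ 1.40 under a normal approximation

z = (52.0 - mean) / sd         # ≈ -1.67: inside 2 sigma, well inside 3
print(f"mean={mean:.2f}, sd={sd:.2f}, z={z:.2f}")
print(f"3-sigma interval: [{mean - 3*sd:.2f}, {mean + 3*sd:.2f}]")
```

On this rough reckoning a ~52% outcome is under two sigmas out, so a 3-sigma criterion (roughly [50.1, 58.6] here) would not flag the popular-vote forecast as a failure, even though the outcome sat near the interval’s edge.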

  18. Sarah says:

    At the end of the day, an election model is only as good as the polls. As a modeler, I can imagine that it is really hard to start using factors that significantly alter the final poll averages, given the very limited set of historical data we have to work with. You start fiddling with things, and before you know it the model turns into garbage in, garbage out. Especially this year, we had a deep well of polling data to work with – the modelers really should not have had to fill in the gaps. Maybe the actual vote was an “unlucky” outcome, but more likely we just had bad polling methodologies again. Yes, you can adjust for that by increasing modeled volatilities (which is what 538 did), but at a certain point a model becomes too uncertain to offer useful predictions.

    Instead of blaming the modelers, I think it would be better to (once again) focus on polling methodologies, because that is the heart of the problem. In particular, I think the polling community should look carefully at why 2016 was fairly bad, why 2018 polling was really good, and why 2020 was terrible. Was it really bad luck or systematically bad sampling? In Florida, for instance, there have to be questions about getting a representative Latino sample. Obviously, the Pew study will provide important insight into these broad polling questions, but hopefully some media organizations (NYT?) can do a very deep dive on this question as well. We definitely need more work on these questions…
