Is there a middle ground in communicating uncertainty in election forecasts?

Beyond razing forecasting to the ground, over the last few days there’s been renewed discussion online about how election forecast communication again failed the public. I’m not convinced there are easy answers here, but it’s worth considering some of the possible avenues forward. Let’s put aside any possibility of not doing forecasts, and assume the forecasts were as good as they possibly could be this year (which is somewhat of a tautology anyway). Communication-wise, how did forecasters do and how much better could they have done? 

[Image: guy ignoring his girlfriend (variance) for another girl (expected value). Image from Kareem Carr.]

We can start by considering how forecast communication changed relative to 2016. The biggest differences in uncertainty communication that I noticed looking at FiveThirtyEight and Economist forecast displays were: 

1) More use of frequency-based presentations for probability, including reporting the odds as frequencies, and using frequency visualizations (FiveThirtyEight’s grid of maps as header and ball-swarm plot of EC outcomes). 

2) De-emphasis on probability of win by FiveThirtyEight (through little changes like moving it down the page and making the text smaller).

3) FiveThirtyEight’s introduction of Fivey Fox, who in several of his messages reminded the reader of unquantifiable uncertainty and specifically the potential for crazy (very low probability) things to happen.

Did these things help? Probably a little bit. I for one read Fivey Fox as an expression of Silver’s belief that something 2016-like could repeat, a way to prospectively cover his ass. The frequency displays may have helped some people get a better sense of where probability of win comes from (i.e., simulation). Maybe readers directed a bit more attention to the potential for Trump to win by being shown discrete outcomes in which he did. (Taking things a step further, Matt Kay’s Presidential Plinko board presented the Economist’s and FiveThirtyEight’s predictions of Biden’s probability of winning Plinko-style with no text probabilities, so that the reader has no choice but to get the probability viscerally by watching it.) While these are certainly steps in the right direction, if probability of winning is the culprit behind people’s overtrust in forecasts (as suggested by some recent research), then we haven’t really changed the transaction very much. I suspect that the average reader visiting forecast sites for a quick read on how the race was progressing probably didn’t treat the numbers very differently based on the display changes alone. 
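To make the “where probability of win comes from” point concrete, here’s a minimal sketch; the draws below are a made-up stand-in for a model’s simulated electoral-vote totals, not either forecaster’s actual output:

```python
# Hypothetical stand-in for a forecast model's posterior simulations of
# Biden's electoral-vote total (a real model simulates state-by-state
# results and sums them; the normal distribution here is invented).
import numpy as np

rng = np.random.default_rng(0)
ec_draws = np.clip(rng.normal(loc=335, scale=40, size=40_000), 0, 538).round()

# The headline "probability of winning" is just the frequency of simulated
# outcomes at or above 270 electoral votes.
p_win = (ec_draws >= 270).mean()
print(f"Pr(win) ~ {p_win:.2f}")

# Frequency framings (the grid of maps, the ball-swarm plot, Plinko) show the
# same quantity as discrete outcomes, e.g. "x wins out of 100 simulations".
sample_of_100 = rng.choice(ec_draws, size=100, replace=False)
print(f"{(sample_of_100 >= 270).sum()} of these 100 simulated outcomes are wins")
```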

So, what could be done instead, assuming news organizations aren’t going to quit providing forecasts anytime soon?  

First, if people are predisposed to zero in on probability of winning (and then treat it as more or less deterministic), we could try removing the probability of winning entirely. Along the same lines, we could also remove other point estimates like predicted vote shares, and instead show only intervals or animated possible outcomes for the popular vote or Electoral College votes. 

If probability of winning is what readers come for, then a drawback of doing this is that you’re no longer directly addressing the readers’ demand. But would they find a way to fulfill that need anyway? My sense is that this would make things harder for the reader, but I’m not sure it would be enough. We didn’t focus on an election context, but Alex Kale, Matt Kay, and I recently did an experiment where we asked people to judge probability of superiority and make decisions under uncertainty given displays of two distributions, one representing what would happen to their payoff if they made an investment, the other if they didn’t. We varied how we visualized the two distributions (intervals, densities, discretized densities, and draws from the distributions shown one pair at a time in an animation). We expected that when you make the point estimate much harder to see, as in the animation, where the only way to estimate central tendency is to account for the uncertainty, people would do better, and that if we then added a mark showing the mean to the visualization, they’d do worse, because they’d fall back on simpler heuristics about how big the difference in means looks. But that’s not what we found. Many people appeared to be using heuristics like judging the difference in means and mapping that to a probability scale even when the animated visualization was showing them the probability of superiority pretty directly! Some of this is probably related to the cognitive load of keeping track of payoff functions and looking at uncertainty graphics at once (this was done on Mechanical Turk). Still, I learned something about how people are even more “creative” than I thought they could be when it comes to suppressing uncertainty. If similar things apply in an election context, readers might still leave the page with an answer about the probability of their candidate winning; it would just be further off from the model-predicted probability. 
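For concreteness, probability of superiority is just the chance that a draw from one distribution exceeds a draw from the other. Here’s a minimal sketch with made-up payoff distributions (not the stimuli from our experiment):

```python
# Probability of superiority: Pr(payoff with investment > payoff without).
# The two normal distributions below are hypothetical illustrations only.
import numpy as np

rng = np.random.default_rng(1)
with_investment = rng.normal(loc=1.2, scale=1.0, size=10_000)
without_investment = rng.normal(loc=1.0, scale=1.0, size=10_000)

# The animated condition shows one pair of draws per frame; across many
# frames, the fraction of frames in which "with" beats "without" converges
# to the probability of superiority, so the display conveys it fairly directly.
prob_superiority = (with_investment > without_investment).mean()
print(f"Pr(superiority) ~ {prob_superiority:.2f}")
```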

Another option might be for forecast pages to lead with the upper and lower bounds on the posterior estimates. Anon suggested something like this. I can imagine, for instance, a forecast page where you first get the lower bound on a candidate’s predicted EC votes, then the upper bound. This could be accompanied by some narrative about what configurations of state results could get you there, and what model assumptions are likely to be keeping it from going lower/higher. 
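To make this concrete, here’s a rough sketch of what leading with the bounds might compute under the hood (made-up simulation draws, and the 5th/95th percentiles are an arbitrary choice):

```python
# Report the ends of an interval of simulated electoral-vote totals before
# (or instead of) any point estimate. ec_draws is a hypothetical stand-in
# for a model's posterior simulations.
import numpy as np

rng = np.random.default_rng(2)
ec_draws = np.clip(rng.normal(loc=335, scale=40, size=40_000), 0, 538).round()

lo, hi = np.percentile(ec_draws, [5, 95])
print(f"Plausible range of electoral votes: {lo:.0f} to {hi:.0f}")
# The accompanying narrative would describe which configurations of state
# results produce the low and high ends, and which model assumptions keep
# the range from being wider.
```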

I suspect reframing the communication around the ends of the intervals could help because it implies that the forecaster or news org thinks the uncertainty is very important. Sort of like if Fivey Fox were the header on the FiveThirtyEight forecast page, with a sign saying, “Don’t really trust anything here!”, and then reappeared to question all of the predictions in the graphics below. You’d probably think twice. Some recent work by van der Bles, van der Linden, Freeman and Spiegelhalter looks at a related question: if you convey uncertainty in a news context with a simple text statement in the article (e.g., “There’s some uncertainty around these estimates, the value could be higher or lower”) versus numerically as a range, which affects trust more? They find that the imprecise text description has a bigger influence. 

In general, leading with discussion of model assumptions, which might seem more natural when you’re focusing on the edges of the distribution, seems like a good thing for readers in the long run. It gives them an entry point for thinking for themselves about how good the model assumptions are. 

But at the same time, it’s hard to imagine this kind of treatment ever happening. First, how much will readers tolerate a shift of emphasis to the assumptions and the uncertainty? Could we somehow make this still seem fun and engaging? One relatively novel aspect of 2020 election forecast discussion was the use of visualizations to think about between-state correlations (e.g., here, and here). Could readers get into the detective work of finding weird bugs enough to forget that they came for the probability of winning?

It seems rather doubtful that the average reader would. From things FiveThirtyEight has said about their graphics redesign, the majority are coming to forecasts for quick answers. If the graphics still show the entire posterior distribution at once in some form, maybe people just scroll to this every time. As long as at least the bounds of a distribution are shown, I suspect most people can easily figure out how to get the answers they want; we might just be adding noise to the process.

On a more informational level, I’m not sure it’s possible to emphasize bounds and assumptions enough to prevent overtrust and backlash if things go wrong, but not so much that readers feel like there’s no information value in what they’re getting. E.g., emphasizing the ends of intervals stretching too far above and below 50% of the vote suggests we don’t know much. 

So the open questions seem to be, how hard do you have to make it? And if you make it hard enough, are you also likely to be killing the demand for forecasting entirely, since many readers aren’t motivated to spend the time or effort to think about the uncertainty?  Given goals that readers have for looking at forecasts (not to mention incentives of news orgs) is a “middle ground” in forecast communication possible?  

Other options involve changing the framing more drastically. The Economist could have labeled the forecast as predicting voter intent, not directly predicting outcomes, as pointed out here. If readers stopped and thought about this, it might have helped. Still, it’s unclear that most readers would take the time to process this thoughtfully. Some readers maybe, but probably not the ones that top level forecasts are currently designed for.

Another option is to consider switching away from absolute vote shares entirely, to focus displays on what the models say about relative changes to expect over the prior election. I like this idea because I think it would make probability of winning seem less coherent. What does it mean to provide relative change in probability of winning over a prior event for which we don’t know the probability? Relative predictions might still answer an information need, in that people can interpret the forecast simply by remembering the last election, and all the context that goes along with it, and have some idea of what direction to expect changes this year to be in. But on the other hand, this approach could also be immobilizing, like when one’s candidate narrowly won the last election, but this time they’re predicted to have less of a narrow lead. Maybe we need to give relative predictions over many past elections, so that the older a reader is, the more lenses they have for thinking about what this one might be like. 

At least in 2020, if a forecaster really wanted to emphasize the potential for ontological uncertainty, they could also tell the reader how much he or she should expect the vote predictions to be off if they’re off by the same amounts as in the last election. Kind of like leading with acknowledgment of one’s own past errors. But whether news organizations would agree to do this is another question. There’s also some aspect of suppressing information that might be unrealistic. Can you really hide the specific numbers while making all the data and code open? Do you end up just looking foolish?  

At the end of the day, I’m not sure the revenue models used by news organizations would have any patience with trying to make the forecasts harder to overtrust, but it’s interesting to reflect on how much better they could possibly get before losing people’s interest.

 

77 thoughts on “Is there a middle ground in communicating uncertainty in election forecasts?”

  1. I think this does a good job of summarizing some options.

    I do think that making clear communication of the interval should be every visualization expert’s job between now and 2024 if they’re in the forecast business.

    Whether news orgs will accept the trade-off of choosing accuracy over trading a false sense of hope for clicks is another story.

    • To convey the sense of what really is being forecast, one needs to be explicit about the nature of the quantities being measured, and the nature of the predicted quantities, to a degree which may seem pedantic. A simple analogy is helpful: when the weather service predicts rain tomorrow with probability 50%, what actually are they saying? I used to think it meant that, given a historical record of November days with meteorological conditions just as they are today, rain was found to fall on the following day in roughly 1 in 2 instances. Well, it turns out that this ‘natural’ interpretation of what their forecasts mean is wrong; what they actually mean by their 50% refers to something more complicated, which I do not fully understand, but the 50% refers in part to a precipitation ‘area coverage’.

      Now to the domain of survey sampling and the reports one wishes to make about the ‘probability’ of a win. Like the rain forecasts, the word ‘probability’ gets carried along, but it is not a primitive term at all. The unconscious habit of supposing that it is a primitive term is the source of the greatest perplexity.

  2. Fantastic, thanks for this. The communication of the forecasts seems almost more important than the forecasts themselves, so very pleased to see this analysis here and the interesting links to experiments. You’re fighting an uphill battle I guess, because of the very broad levels of statistical literacy of the people looking at the forecasts, the very different amounts of time they are able/willing to invest in interpreting the forecasts, their varied prior beliefs and biases, their differing goals for what they want to get out of the forecasts, etc. And then, measuring how well people have understood the forecasts is very hard. Very interested to see what changes are made in future given the lessons of the past few elections.

    One thing I’ve noticed on my small, biased sample of Twitter is people translating a 0.96 (or 0.89 or whatever) probability of winning 270+ electoral college votes into suggesting that the margin of the win should be very large, without necessarily looking at the distribution of simulated results. This probability on its own doesn’t tell you about the expected margin of the win of course – that requires the simulated results distribution, which both 538 and The Economist also presented (538 with that smoothing, which seemed unjustified). It wasn’t clear how much some people understood this, or how they interpreted the distribution in relation to the point win probability, but it obviously was causing confusion.

    The other thing I noticed on Twitter was that many people seemed to indicate 538’s comments in particular were hedging a bit too much to the point where their prediction was unfalsifiable (which seems to be a problem with these sorts of models anyway to an extent), further reducing their trust in the forecast. So it seems with these people at least, they’ve already gone too far in the direction mentioned in your last paragraph.

    • Nice analysis. But I feel it underestimates one psychological factor. Uncertainty is in the mind of the beholder… People tend to be heavily invested in the election outcome, and will tend to rationalize why their hopes or fears might come true.
      And they will do this daily, closely following every up or down.

      In my view this is not about adequately communicating how uncertain the forecasts are.
      The most valuable addition would be to highlight how long it will take to call the winner on/after election day, in big bold font at the top of the page.

      • “Uncertainty is in the mind of the beholder” – yes definitely.

        “The most valuable addition would be to highlight how long it will take to call the winner on/after election day, in big bold font at the top of the page.” That’s interesting, it’s not like it was unknown, just not often communicated side-by-side with the forecasts. 0.96 probability of a Biden win, but we’ll only know on Saturday… does that make sense?

  3. I suspect there would be little tolerance for too much digging through the web page looking for that kernel of certainty that wasn’t there, before hitting the back button and finding a different website to fulfill the need.

    • Yes, as in uncertainty communication in the public sphere more broadly, it seems norms and conventions can play a big role. How to get out of a “bad equilibrium” of suppressing uncertainty or giving readers the probabilities they want is something I’d love to see more people thinking about.

      • I think this might be a human nature problem across the board… I’m not sure if it is norms and conventions or something deeper.

        I just finished an analysis on a nationwide longitudinal dataset. I used a Bayesian multi-level model, and I thought I had explained the results pretty well. During internal review, an MD writes back and says something to the effect that he knows there are no p-values in this analysis, but can’t we just do some simple test for the trend over time so that we can say it’s statistically significant?

        In my experience, it seems the majority of people, regardless of education, want ‘an answer’. Not a ‘maybe’.
        I agree that thinking about how to make ‘uncertainty’ qualify as ‘an answer’ rather than ‘a maybe’ sounds worthwhile. Slightly skeptical that is possible.

        • jd said,
          “I agree that thinking about how to make ‘uncertainty’ qualify as ‘an answer’ rather than ‘a maybe’ sounds worthwhile. Slightly skeptical that is possible.”

          The idea that uncertainty is an inherent part of nature is something that should be really important for psychologists to study — but I wouldn’t be surprised if the majority of psychologists themselves have an aversion to uncertainty.

      • I unapologetically want a probability! Give the people what they want!

        I do recognize that there’s a problem. One of my friends was furious with 538.com in 2016 because they “promised” Trump was going to lose. I pointed out they were giving Trump around a 20% chance, in the days before the election, and that’s a long way from a “promise”…but my friend insisted that 80% is pretty much a sure thing and how dare they say there was a 20% chance if Trump actually had any chance at all of winning!

        Similarly, I’ve read that weather forecasters cheat their probabilities towards 50% for extreme probabilities, e.g. if they think there’s a 5% chance of rain they’ll say there’s a 10% chance, or even more.

        So I agree there’s a problem with people failing to understand uncertainty, but I think that goes beyond something that can be fixed just by finding the right way to frame model results.

        • More people should gamble, or play probabilistic video games. My intuition for probability has been greatly enhanced by failing at XCOM. If you always bet everything on 95% chances, you’ll eventually lose.

        • There’s definitely something to this, not just in giving people practice with quantitative uncertainties but also with testing what people _really_ believe.

          With 538 giving Trump a 20% chance a few years ago, and my friend thinking of that as a “sure thing”, what odds would my friend have actually given if we had made it a wager? If the answer is, say, 20:1 then OK, my friend just doesn’t understand the math. But if it isn’t 4:1 or worse, that would be an acknowledgement that OK, yeah, they understand intellectually that it is not a “sure thing”, even if emotionally they’re treating it as a lock.

        • To make a wager one needn’t understand the math. And one who understands the math may not be one who feels competent to make a wager. What is a “wager” anyway? If I cross the street, I suppose I “wager” I’ll not be struck; but I guess it’s not a “wager” unless someone else dares me to do it, on the off-chance he’ll make a dime. What crap!

        • “Wager: risk (a sum of money or valued item) against someone else’s on the basis of the outcome of an unpredictable event; bet.” — Oxford Languages

        • Haha, yeah, XCOM is great. I played a bit, got hit with some unexpected contingencies, lost my team, and quit. It was pretty funny overall.

          I was thinking, maybe I didn’t learn any lessons cuz I just bailed. But maybe I did, cuz it’s like my XCOM career ended after a couple of those missions lol. Anyway, high quality entertainment.

  4. The Adam Pearce chart had problems, though, in that he used his charts to argue 538’s unusual maps were a sign of a failed idea of demographic realignment. That wasn’t a good analysis in retrospect. In fact, the idea of demographic change they suggested turned out to be truer than not. We can disagree over whether 538’s error was modeled correctly, but it isn’t clear that just having one new visual style necessarily leads the reader to the right conclusions if the writer doesn’t understand what they mean.

  5. Thanks for introducing me to the “ballswarm plot”. Although might I suggest “ball-swarm plot”, as I initially read it with the word break one character to the right…

  6. Jessica:
    Thanks, I learned a lot from your post and I’m glad to see so much rigorous thought on this important topic.

    To me, the main problem with the way 2020 election forecasts were communicated was the absence of disclaimers about the assumptions that the forecasts were contingent on. I’d suggest explaining the assumptions right in the headline, alongside the forecast’s point estimate or interval.

    Sometimes, people react to this kind of suggestion by saying, “That’s great in principle, but doesn’t work in practice. Forecast assumptions are long and complicated to state even in a way that experts understand. The typical reader won’t understand an explanation correctly, even if they read it, which they probably won’t. Complaints about failing to explain assumptions are a common and valid criticism about articles in the academic literature, but people who make similar complaints about election forecasts are forgetting that the forecasts aren’t meant for us, they are meant for more typical readers.”

    It seems like you have a similar perspective when you write,

    The Economist could have labeled the forecast as predicting voter intent, not directly predicting outcomes, as pointed out here. If readers stopped and thought about this, it might have helped. Still, it’s unclear that most readers would take the time to process this thoughtfully. Some readers maybe, but probably not the ones that top level forecasts are currently designed for.

    I see where you are coming from, but I still think there are great opportunities for communicating forecast assumptions in a way readers understand.

    Concretely, I’d like to suggest that “unprecedented” is the magic word, which is perfect for this use.

    For example, the Economist headline number could have been replaced with something like, “Trump has a 3% chance of winning the Electoral College, excepting unprecedented effects of COVID-19 or unprecedented levels of unfairness in the election.”

    I agree it is difficult to explain what scenarios a predictive model does and does not account for, but “unprecedented” has close to the right meaning to both non-expert readers and statisticians. It seems like this would get the message across to anyone who knows the word’s definition. I also think the message is important to provide alongside forecasts in an election that happens during a pandemic, and in which the powerful incumbent repeatedly refuses to say he will respect election results.

    Sorry for the overly long comment.

      • “The Economist could have labeled the forecast as predicting voter intent, not directly predicting outcomes, as pointed out here”. But it seems that the problem was not particularly with voter intent vs. actual vote behavior, but rather with systematic problems with the polls themselves. The polls underestimated support for Mr. Trump in important states, and either the polls or the model underestimated the amount of extra voter turnout.

      Neither making the estimated bounds clearer nor reducing the importance of the mean estimate would have addressed those issues.

        • Tom, that’s a good point about what happened in 2020, and polling error may be worth mentioning too. To me, though, it’s more about stating the assumptions of a forecast up front, especially when those assumptions have a fairly large chance of deviating from reality (for example, regarding COVID-19, unfairness in ballot counting, and polling error).

        Maybe this wording is better:

        “We predict that Trump has a 3% probability of winning the presidency, assuming his chances are not affected by events that are unprecedented in other recent elections, such as could be produced by exceptional polling errors, unfairness in ballot counting, and COVID-19.”

        I’m sure that could be improved many other ways too, but the larger point is that it is possible to put a statement of assumptions alongside your headline model predictions, without the disclaimer being too long or technical for the audience.

      • Martha — To me, unprecedented means “without precedent” and the meaning of “precedent” is nicely communicated by the following example sentence

        * “Conditions have changed enormously, and the past is not much of a precedent.”

        I find that the word “unprecedented” is magically helpful when trying to explain how predictions from a statistical model both (a) don’t assume the future is exactly like the past and (b) will be wrong if the future is too unlike the past. It’s difficult to understand how that combination of (a) and (b) happens unless you work with regression models yourself, but the word “unprecedented” does the best job of explaining it that I’m familiar with.

        But maybe I’m wrong, and unprecedented means something different to me than it does to most others! That seems quite possible too. The definition you give is different than what I think of, so maybe some other word is more suitable, or maybe even the phrase “without precedent” is best.

        • I used the definition “never done or known before” for “unprecedented” because it is what came up from a web search, and agreed with my understanding of the word. Looking it up now in my desk dictionary (American Heritage, 1985) I get “Without precedent,” and for “precedent”, I get
          “1. a. An act or instance that may be used as an example in dealing with subsequent similar cases.
          b. Law. A judicial decision that may be used as a standard in subsequent similar cases
          2. Convention or custom”

          This sounds like your and my definitions each fit one of that dictionary’s definitions.

    • Yeah, I think I’m happy with the different plots of probability, and the different sorta outcome plots (the map Phil posted a couple posts back was cool).

      I thought the discussion of correlation in 538 vs Economist was interesting. Like I thought in one of the posts Andrew talked about being vulnerable to some national shift in polls (and then the whole discussion of, if Trump wins NJ, what happens, etc). So when the early states started looking a bit different from what was expected, it wasn’t so surprising that other states followed (they’re not independent).

      Anyway I wasn’t paying super close attention to the modeling and whatnot but I think the background assumption discussion that went on was really interesting vs. burrowing into what exactly 96% chance means.

  7. Isn’t part of the issue the fact that highly consequential get-out-the vote campaigns are launched (or maintained or altered) in response to (and therefore after) the forecasts are made? I really appreciated Andrew’s distinction about voter intent, because that’s a real thing, at least in principle, at the time the forecast is made. What’s not a real thing at the time of the forecast is propensity to show up and vote, because that depends on decisions that are yet to be made. Because of this, I keep thinking we should look for better ways to unpack “likely voter” assumptions. For example, forecasters might consider scenarios like “if the voter composition is similar to this or that previous election” (do we have useful data on voter characteristics?), or try to separate out quantifications for “persuadability” and “motivatability”. Sounds kind of impossible now that I see it written, but isn’t that what we’re up against?

    • Josh:

      I’m guessing that coronavirus and early voting had huge effects, both on total turnout and on differential turnout for the two parties. I doubt that the promulgation of forecasts had much to do with anything. The polls are supposed to estimate turnout too (using “likely voter screens”), but we’ve always known this is much more difficult than assessing vote intention.

  8. I think a good option would be to have an interactive graphical presentation of the modeled outcomes and their uncertainties, and then allow users to toggle an option for *correlated bias across states in the two-party vote share*. So, you could move that widget between say -5 and +5 (very generous range, this miss looks like around 2.5%). I know this is veering into meta-probability land quite hardcore, but eh, that’s where we are!
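    Something like this sketch, maybe (entirely made-up numbers, not either forecaster’s model): shift every state’s simulated two-party share by the slider value and recompute the win probability.

    ```python
    # Toy version of the proposed widget: a correlated-bias slider applied to
    # hypothetical state-level simulations (all numbers here are invented).
    import numpy as np

    rng = np.random.default_rng(3)
    n_sims, n_states = 20_000, 51
    ev = rng.integers(3, 56, size=n_states)            # stand-in electoral votes

    # Simulated Dem two-party shares: shared national component plus state noise.
    national = rng.normal(0.52, 0.02, size=(n_sims, 1))
    state_share = national + rng.normal(0.0, 0.03, size=(n_sims, n_states))

    def win_prob(bias_points):
        """Pr(Dem EC win) after shifting every state by bias_points (pct points)."""
        shifted = state_share + bias_points / 100
        dem_ev = ((shifted > 0.5) * ev).sum(axis=1)
        return (dem_ev > ev.sum() / 2).mean()

    for b in (-5, -2.5, 0, 2.5, 5):
        print(f"correlated bias {b:+.1f} pts -> Pr(win) ~ {win_prob(b):.2f}")
    ```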

  9. It is worth asking “Compared to what?” These sites – yours and 538 – have been so much better than the alternative. Even major newspapers have articles about “poll shows …” (often a poll they commissioned). People over-react to small things.

    And you can also look at the betting markets if you want some alternative to poll aggregation and prediction. Another alternative is aggregation of probabilistic forecasts, but nobody seems to publish those when they concern elections. Maybe remind people of the alternatives.

  10. “if people are predisposed to zero in on probability of winning and suppress uncertainty…”

    This doesn’t make any sense to me. Any probability of a win other than 0 or 1 *is* an expression of uncertainty, not a suppression of it.

    • Some people find that claiming that the probability of something is 80% is too precise and would rather have an interval of probabilities like [75% 85%]. But I agree with you, I don’t quite get that reasoning.

      • Carlos, I think that my idea of being able to toggle one of the key uncertainty parameters might be the way to get best of both worlds here. Retain the interpretability of a conventional probability, but clearly show how it derives from assumptions.

        • I agree that some sensitivity analysis can be helpful to understand the model and how to interpret it. It’s also interesting to think about how much of the uncertainty comes from known unknowns and how much from unknown unknowns, how should we expect it to change with new information…

          A full model of probabilities of probabilities, something like Jaynes’s Ap distribution, is too complex (as you say, probability is confusing enough before stepping out into meta-probability) but maybe one could decompose the model/forecast into a mixture of models/scenarios easier to grasp.

    • People don’t know what a probability corresponds to in terms of vote margin interval. So they map it onto polling, the other percentage they do know and that’s very relevant. If you want to use probability you have to teach people to see probabilities and outcome confidence intervals as a two-way map, but that’s not what they do.

    • Yes, technically it is, but the problem is that it doesn’t appear to be treated that way in how they make decisions. E.g., FiveThirtyEight’s 2016 forecast has a big header with two large numbers: percentage chance of Clinton winning, percentage chance of Trump winning. Many Democrats saw numbers in the 70-80% range leading up to the election, then when Clinton didn’t win they were completely shocked. So there still seems to be suppression of uncertainty in that sense, even though it’s right there in front of them. I edited the sentence though, you’re right it was a bit confusing.

    • Kevin said,
      “This doesn’t make any sense to me. Any probability of a win other than 0 or 1 *is* an expression of uncertainty, not a suppression of it,” in response to ““if people are predisposed to zero in on probability of winning and suppress uncertainty…”

      I think the point is that most (or at least many, to be generous) people don’t realize that if there is certainty, then there is no need for probability.

  11. Expose some fundamental uncertainty/bias parameters, like how correlated states are and how much we can trust polls and how biased you think the polls are in one particular direction, but on the top level, show maybe 6ish “personalities” with different presets so you can tell a story “here’s what we think; here’s what the person yelling about shy Trump voters thinks; here’s what the person who doesn’t trust polls but doesn’t know why thinks; etc”. Only give them suggestive names. Basically promote the idea that your estimate of the chances depends on your priors and different folk seem to have different ideas. Like business folk do when determining what products to build, they make profiles, each representing wide swathes of users. Makes for much easier storytelling that allows for different perspectives.

    • Guy said,
      “Basically promote the idea that your estimate of the chances depends on your priors and different folk seem to have different ideas. Like business folk do when determining what products to build, they make profiles, each representing wide swathes of users. Makes for much easier storytelling that allows for different perspectives.”

      Sounds like a reasonable thing to try.

  12. “And if you make it hard enough, are you also likely to be killing the demand for forecasting entirely, since many readers aren’t motivated to spend the time or effort to think about the uncertainty?”
    Would that actually be a problem?

    • I think a big part of the problem is that we (society in general, education in particular) do not stress uncertainty enough, as just being part of what Mother Nature (or God, if you are so inclined) has tossed at us. (And many people’s aversion to uncertainty in general plays a big part in this.)

  13. I am not sure it is a communication problem. The reader wants to know which candidate will win, or at least which candidate will likely win, in the 2020 election. The model says historically, on average, polls are unbiased and vary within this range. The reader looks at the forecast and says I should trust the polls because this is a sophisticated scientific analysis showing which candidate will win. Then the polls are wrong by a significant amount and the reader is upset. The only way the reader is happy is if the polls are reasonably accurate. If the polls are reasonably accurate, then it is less clear that we need a forecast model.

    If you look at the bounds then you get some analysis like Nate Silver’s final summary where he says it could be “landslide” or “nail biter”. How is that supposed to help the reader? You don’t need a fancy model to know Biden can probably survive large polling error; he was way ahead in the polls.

    In retrospect, was there ever really a 90+% probability of Biden winning with 30% probability of landslide and 75+% probability of trifecta? It seems like high polarization, incumbency advantage, and 2016 polling errors should have tempered expectations. Instead it is easy for people to lay out a list of reasons why it makes sense for their preferred candidate to be significantly favored.

    • Responding to myself, I can give one concrete communication suggestion. I would be very interested in the most likely path the trailing candidate has to winning. We know polls aren’t going to hit the nail on the head, so throw some polling errors out there that create a narrative for a plausible path to victory for the trailing candidate. Maybe the polls tighten a little by election day, a small national error, some geographically/demographically correlated errors, and some state-level errors that give a plausible path to winning for the candidate who is down in the polls. Walk me through it. Maybe include some real examples from past elections. Then, give some qualitative assessment of how likely that type of scenario might be. Maybe give a few examples of different scenarios.

      I don’t want 40,000 nondescript simulations that are centered around the current polling. I mostly want to know how likely it is that my preferred candidate can still lose (or win) this race. I’m not sure you need a sophisticated model to lay out that type of analysis and it is probably not something that you constantly update.

        • Andrew:

          I think you are referring to tipping point probability or something similar. That is not what I was trying to describe. I don’t mean a path to winning as in what states a candidate would need to win. I mean path as in what errors would have to occur for them to win. I want you to describe a scenario that has relatively high probability of occurring (or is representative of a type of similar outcomes) and would result in the less favored candidate winning. I don’t mean simply polling errors of x, y, z in states a, b, c. I want a scenario like: polls are off by X in sunbelt states, or polls overestimate Trump-to-Biden voters, then additionally a polling error of X in Pennsylvania, etc. I don’t think most people care about the full distribution of what is the maximum Biden might win by or the maximum electoral votes Trump could win in the most outrageous scenarios. I would prefer a few scenarios with qualitative assessments of probability, with a particular focus on the most probable scenario to flip the outcome.

        • N:

          Oh, that’s easy. The scenario to shift Biden’s vote share to 51.5% (so that the election would be on a knife edge) would be that all the polls are off by about 3 percentage points, which is a lot but is in the realm of possibility. The polls can be off for various reasons, including differential nonresponse, differential turnout, and differential ballot rejection.

    • Yes, that’s a great point. And I agree, it all sort of falls apart on some level when you try to analyze it. In this case I was trying to independently discuss how much better forecasters could do at preparing readers for the highly unexpected, under an assumption that we can never know ahead of time how off the forecast might be due e.g., to polling error.

  14. There seem to be two different (but not completely independent) questions: how to communicate better and how to communicate less.

    Not giving point estimates to force the reader to look at the broad picture and appreciate probability intervals or entire posterior distributions is one thing.

    Avoiding large intervals that suggest that we don’t know much or hiding posterior distributions is something else.

  15. Readers—even readers with training in statistics—struggle not to translate a 90–95% chance of ending up on a given side of a binary election outcome into a clear, convincing, easy election evening for that side’s supporters. What readers really care about on election night is how they feel. So perhaps a helpful way to represent the model findings would be in terms of election night dynamics: comfortable lead for A; uncomfortable lead for A; neck-and-neck; uncomfortable lead for B; comfortable lead for B. Each scenario generated in the model could be categorized in such a way (and it wouldn’t necessarily match up neatly with the final EC results, as the 2020 experience shows).
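    A crude sketch of the categorization, using only a hypothetical simulated final margin (a real version would also need the order in which votes get reported, since the point is about how the night feels as it unfolds):

    ```python
    # Bucket simulated outcomes into qualitative "election night" categories.
    # The margins and cutoffs below are invented for illustration.
    import numpy as np

    rng = np.random.default_rng(4)
    margin = rng.normal(4.5, 3.0, size=20_000)   # simulated margin for A, in points

    def categorize(m):
        if m > 4:    return "comfortable lead for A"
        if m > 1:    return "uncomfortable lead for A"
        if m >= -1:  return "neck-and-neck"
        if m >= -4:  return "uncomfortable lead for B"
        return "comfortable lead for B"

    labels, counts = np.unique([categorize(m) for m in margin], return_counts=True)
    for label, count in zip(labels, counts):
        print(f"{label}: {count / margin.size:.0%} of simulations")
    ```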

  16. I don’t think the general public will be able to grasp probability in election forecasts because they don’t grasp probability, period. The main emphasis should be on better data, meaning improving polling methodologies. About the only exposure people may get to forecasted probable outcomes is if they are interested in sports or sports betting, and they see that the projections are multiple runs of multiple scenarios. And in those cases, the bettors typically know the projections aren’t reliable. It’s an interesting irony that the unreliability of projection in betting leads to so many businesses that sell predictions (and which all claim success!).

    But with major elections, you have forecasts that people at least want to treat as reliable. But these people have very limited exposure to the basic concepts involved, and very few people grasp much about probability. Improving communication is always desirable but unless the polling is accurate, you’re putting lipstick on a pig, to use an old sexist metaphor.

    • jonathan, what you are essentially stating is that due to the general public’s inability to grasp probability overall, election forecasting and polling will always be doomed to failure in terms of providing accurate information to the public about the state of the race.

      Am I interpreting you correctly?

  17. You talk about modelling differential non-response: I think you should also consider modelling the responses of people who believe that they are under social pressure to vote for and to appear to support a particular party – whether or not that pressure actually exists. There is previous work in studies of the prevalence of socially stigmatised personal behaviour.

    • This is what is referred to in polling as the “shy voter” effect; there’s also a related issue called the Bradley Effect. If you read these posts and comments with that lingo in mind, you’ll see that there is a fair amount of discussion about it.

  18. I think you’re right that there is only so much that presentational changes can help, and that the reason is that they only obscure the very thing that people are looking for.

    The problem is not that forecasts aren’t presented carefully. The problem is that they’re not accurate enough. People don’t want to know what the confidence interval is; they just want narrower confidence intervals. They don’t want to know what the sources of error are; they just want less error.

    Of course, that’s likely impossible. But people want the impossible all the time! :-)

    I think the simplest way to reorient the presentations would be to do as much as possible to present the models as very distinct from the actual events they model (i.e., the election). Instead present them as simply a fun exploration of some of the parameters. In other words, don’t even pretend to be forecasting voter intent, vote share, etc., let alone the election outcome. Just say you’ve built a simulation that is sort of like an election and here’s what it does.

    In the early stages of the coronavirus pandemic, we saw a number of such tools illustrating the phenomena of exponential growth, R0, herd immunity, etc. All the ones I saw explicitly disavowed being models of the actual pandemic; they were simply models of abstract virus-like growth patterns on a grid of dots which were meant to illustrate principles that might be relevant to understanding the pandemic. The remarkable interactive works by Nicky Case are other examples.

    The “problem” with this is that it means forecasters have to stop claiming that they are forecasting the election. They instead have to say they’re building a model that is sort of like the election and letting people look at it and maybe play around with it. But I think this could be useful in shifting the focus from the election itself to how the model works, which can help people see what factors the model takes into account and what factors it doesn’t, and thus understand its limitations. For instance, imagine an interactive model where the user could insert a fake poll with particular results and observe its effect on the model (akin to adjusting the R0 in one of the pandemic simulators). Another “problem” with this is it requires the modelers to expose more of the innards of their models, not just the top-line prediction — but again maybe that can be seen as an advantage.

    For better or for worse, I think there is a minimum level of accuracy that a model has to have for people to consider it worthwhile. And, unfortunately, I think the more infrequent and important the event that is modeled, the HIGHER that minimum is. In other words, it is because elections are so infrequent that they are difficult to predict, but it is also because they are infrequent that people want the predictions to be very accurate (since they will have to live with the results for quite a while). This is a fundamental paradox with predicting things like elections. Ultimately the models have to either get closer to reality (i.e., be more accurate) or explicitly distinguish themselves from reality (i.e., disavow any intent to predict the actual election).

  19. Very hard problems. Four thoughts.

    1. Consider the audience. You have to assume, surely, fairly casual readers with a broad range of statistical sophistication, whose interest is in the question “Who’s going to win?” They are likely to think you are answering that question, whatever you say. (Consider how often people, even people who know, misinterpret “significance”, treating it as answering the question they want to ask (“Is this result true and important?”) even though it doesn’t answer that question, and even though they in some sense “know” that it doesn’t answer that question.)

    2. My own experience suggests that how you present an estimate makes a big difference. I’m a lawyer. We are often asked to provide estimates of our assessment that someone will win a case. These are not “modelled estimates”, just subjective judgements, and very rough and unreliable. I’ve found it makes a huge difference how you present them. Take a typical case where my client has the edge but not by much. If I say “60 percent chance of winning” they tend to see this as good news, even nearly a guarantee of victory! If I say “it’s really a toss up, you’ve got the edge but it wouldn’t be at all surprising if you lost” they see that as much riskier. If I say “suppose we fought this point ten times, you’d come out winner about 6 times, and loser about 4, and please remember that on those four times you lose, you have completely lost “, they tend to see the risk more clearly. In my own practice, I generally prefer to avoid the numbers, but if clients want them, I find the explanation in terms of imagined frequencies generally seems less likely to produce false optimism and certainty, though that’s pure anecdote.

    I haven’t found ranges help much to communicate uncertainty. I’ve generally found it more helpful to put it in words “Just remember this is early days, and a lot is going to change. Anything I say now is more or less guesswork, and you really shouldn’t be taking it too seriously at this point.” One reason I think the words work better, is that as soon as you put numbers on something people see that as a reflection of “precision” and they confuse “precision” with “certainty”.

    3. One needs to watch for other confusions, which are almost subliminal. For instance, if I say “Biden has a 75 percent chance of winning”, do people subliminally confuse this with “Biden will get 75 percent of the vote”? If I say “Biden is ahead in most of the polls”, do people subliminally confuse this with “Biden is very much ahead in the polls”? I am sure readers understand the differences, but I suspect this sort of confusion can operate subliminally. Probably especially with graphics. (Compare eg the way that people who are told a test has a low false positive rate tend to assume that this means a positive result is very strong evidence of what the test explores, and I think often do so *even when they know about base rates* unless you force them to think it through.)

    4. Let’s not lose sight of the fact that the BIG problem is not communicating the uncertainty of a forecast, it’s getting a reasonably reliable forecast. If the basic data going in to it is so mucky that the forecast has become a stab in the dark, *any* sort of presentation that makes it look “scientific” and “precise” is liable to confuse. If polling remains as uninformative as it seems to have been in the past two elections, any forecast at all is going to have problems.

  20. > At least in 2020, if a forecaster really wanted to emphasize the potential for ontological uncertainty, they could also tell the reader how much he or she should expect the vote predictions to be off if they’re off by the same amounts as in the last election. Kind of like leading with acknowledgment of one’s own past errors. But whether news organizations would agree to do this is another question

    Actually, the NYTimes featured exactly that information.

  21. Maybe report two probabilities instead of one: the lower and upper bounds (e.g., a 95% confidence interval). It would just be a transformation of X% ± Y% to the range (X−Y)% to (X+Y)%.

  22. Here’s a suggestion about communicating election forecasts that I think should be both useful and uncontroversial:

    Using simulations from the model, report the chances that the election winner will hinge on a small proportion of all votes, like only 50,000 votes (which is 0.03% of all votes) or only 25,000.

    Because of the electoral college, this happens much more often than is intuitive, even when one candidate is heavily favored. For example, even though the Economist model gave Biden a 97% win probability, I found that in 8.6% of the model’s posterior simulations, the identity of the election winner depended on less than 0.03% of the votes cast (less than 50,000 votes). Similarly, FiveThirtyEight gave Biden a 90% win probability, but analyzing its simulations shows a 14.1% probability that the winner would depend on less than 0.03% of votes cast.

    Reporting values like this can help explain how 97% and 90% win probabilities are consistent with having only a small number of votes that separate winning from losing.

    People’s feelings of betrayal about forecasts seem to be associated with the very small numbers of votes that often separate winning from losing, even when one of the candidates is heavily favored by the forecast. For example, Clinton was heavily favored in 2016 and Biden was heavily favored in 2020, but a change in only 38,875 votes (0.03%) would have switched the 2016 election from Trump to Clinton and a change in only 26,885 votes (0.02%) would switch the leader in the 2020 election from Biden to Trump as of Friday. And in 2000, Gore was favored (I think?) but a change of only 269 votes (roughly 0.0003% of all votes cast) would have switched the outcome to him.

    It could help readers understand these realities of the electoral college system if forecasts informed them about how often tiny vote differences are anticipated to separate the winner from the loser.

    • Here is evidence to substantiate my statements, showing how forecasts can heavily favor one candidate, yet at the same time predict that the election will often depend on only a tiny proportion of all votes cast. This is a consequence of the electoral college.

      These are histograms showing the number of votes that, if switched to the other candidate, would change the election’s winner in Economist and FiveThirtyEight predictions.

      Economist: https://i.postimg.cc/sXSZ1M7z/election-sensitivity-economist-model.png
      FiveThirtyEight: https://i.postimg.cc/5NfFNN0c/election-sensitivity-538-model.png

      The Economist model gave Biden a 97% chance of winning, but I found it predicted that the number of votes separating a Biden win from a Trump win would be below 10,000 (0.006% of all votes) with 1.8% probability, below 50,000 (0.03% of votes) with 8.6% probability, and below 100,000 (0.06% of votes) with 14.8% probability. The calculations also required state turnout ratios, which I took from 2020 results reported to date.

      FiveThirtyEight gave Biden a 90% chance of winning, but I found it predicted that the number of votes separating a Biden win from a Trump win would be below 10,000 with 3.7% probability, below 50,000 with 14.1% probability, and below 100,000 with 22.6% probability.

      Values of this type could help readers understand that elections can still be extremely close, depending on less than 0.1% of all votes cast, even when one candidate is heavily favored to win.

      Calculating the values is a little subtle owing to the electoral college, but can be done by identifying the situation as a variant of the 0-1 knapsack problem.
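      For anyone who wants to reproduce this kind of number, here is a rough sketch of that knapsack variant applied to a single made-up outcome; in the analysis above it would be run on each posterior simulation and the results summarized as histograms.

      ```python
      # Minimum number of individual votes that, if switched to the loser, would
      # flip the Electoral College outcome of one simulated (or actual) result.
      # Solved as a "min cost to remove at least a target weight" 0-1 knapsack.
      import math

      def min_votes_to_flip(states, winner_ev):
          """states: (electoral_votes, winner_margin_in_votes) for each state the
          overall winner carried; winner_ev: the winner's electoral-vote total.
          Returns the fewest switched votes that bring the winner to <= 269 EVs."""
          target = winner_ev - 269                  # electoral votes to take away
          dp = [0] + [math.inf] * target            # dp[j]: min cost to remove >= j EVs
          for ev, margin in states:
              cost = margin // 2 + 1                # vote switches needed to flip this state
              for j in range(target, 0, -1):
                  candidate = dp[max(0, j - ev)] + cost
                  if candidate < dp[j]:
                      dp[j] = candidate
          return dp[target]

      # Hypothetical example: winner has 306 EVs and carried these states by these margins.
      states = [(16, 20_000), (11, 12_000), (10, 25_000), (20, 90_000)]
      print(min_votes_to_flip(states, winner_ev=306))   # -> 28503
      ```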

      Last, here is a nice article on how ridiculously close US elections have historically been: https://theconversation.com/the-electoral-college-is-surprisingly-vulnerable-to-popular-vote-changes-141104

        • I’m glad you like the suggestion Carlos, and thank you for pointing out the excellent WaPo article.

          Another perspective on this is that people want an election forecast to tell them whether the election is “safe” for their candidate. “Safe” might be defined as the winner of the election remaining the same, even if 1 in 200 voters anywhere switched to voting for the other candidate.

          Analyzing the Economist’s model’s simulations shows that there was only a 56% chance that the election would be “safe” according to this definition, even though they gave Biden a 97% chance of winning. Similarly, FiveThirtyEight’s model gave a 49% chance the election would be safe in this way, despite giving Biden a 90% win probability.

          Now the definition of “safe” I used is very debatable! But any better definition will have the same problem — many elections won’t feel “safe” even when a candidate is heavily favored. I think readers of forecast models would benefit from learning this.

  23. The context of this discussion about whether the general public has a mistaken understanding about the degree of uncertainty of election forecasts or polls is the persistent undercounting of supporters of Donald Trump and the supposed “failure” of polls predicting a landslide win for Biden.

    But the implicit assumption is that the polls themselves (which are largely based on traditional pollsters asking individuals questions about political support) are getting reliable and accurate answers from a representative sample.

    Is that assumption even justified?

    Given the polarization of the American public (which include the sweeping generalization of ALL those who voted for Trump as racists, sexists, homophobes, or otherwise “deplorables”), combined with the distrust of many Trump supporters for “elites” or “experts” who they blame (rightly or wrongly) for the ills of the country (and they would include pollsters as among the “elites”), is it not possible that there are people who may either:

    1. Actively lie to pollsters about who they are voting for

    or

    2. Choose not to respond to pollsters?

    • In other words, to what extent does response bias impact traditional election polls, and can the methods used by traditional polling organizations be effective in mitigating it?

  24. Also a question to those in this thread:

    How familiar are all of you with “Polly”, an AI system designed by Canadian company Advanced Symbolics to provide election forecasts and other predictions of people’s behaviour by using aggregate data from social media? There have been news articles indicating that Polly had successfully predicted the 2016 Trump win, the results of the 2016 Brexit vote in the UK, and the election results from a number of Canadian federal and provincial elections.

    Here is a link to Advanced Symbolics:

    https://advancedsymbolics.com/about-our-ai/

    And here is a CBC news article from Canada about Polly:

    https://www.cbc.ca/radio/day6/no-knock-warrants-monitoring-the-u-s-election-ai-pollsters-west-wing-reunites-bts-stock-and-more-1.5763944/meet-polly-the-ai-pollster-that-wants-to-predict-elections-using-social-media-1.5763952

    • Well, they forecast Biden 372 so…

      In the end I don’t think there’s enough data to build good models like that. There’s just not enough elections in the era of social media.
