So . . . what about that claim that probabilistic election forecasts depress voter turnout?

A colleague pointed me to this article by Sean Westwood, Solomon Messing, and Yphtach Lelkes, “Projecting confidence: How the probabilistic horse race confuses and demobilizes the public,” which begins:

Recent years have seen a dramatic change in horserace coverage of elections in the U.S.—shifting focus from late-breaking poll numbers to sophisticated meta-analytic forecasts that emphasize candidates’ chance of victory. Could this shift in the political information environment affect election outcomes? We use experiments to show that forecasting increases certainty about an election’s outcome, confuses many, and decreases turnout. Furthermore, we show that election forecasting has become prominent in the media, particularly in outlets with liberal audiences, and show that such coverage tends to more strongly affect the candidate who is ahead—raising questions about whether they contributed to Trump’s victory over Clinton in 2016. We bring empirical evidence to this question, using ANES data to show that Democrats and Independents expressed unusual confidence in a decisive 2016 election outcome—and that the same measure of confidence is associated with lower reported turnout.

The debate

My colleague also pointed me to this response by political analyst Nate Silver:

Not only that, but none of the evidence in the paper supports their claims. It shouldn’t have been published.

The experiment finds that people are *underconfident*, not overconfident, when they see probabilistic forecasts. It directly contradicts the central tenet of their hypotheses.

Matt Grossman replied:

Reconciliation is people are bad at estimating win probabilities from vote share & vice versa. So they interpret a win probability of 75% as a landslide (even if they reduce it to 65% in their head) & see an 8 point lead as near 50/50. Turnout responds to perceptions of a toss-up.

To which Nate responded:

That’s the most generous interpretation and it still only rises the paper to the level of “it’s possible the evidence supports rather than contradicts our hypothesis if you introduce a number of assumptions we didn’t test (and ignore a lot of other problems with our experiment)”. . . . I think it’s unethical for them to make such strong claims with such weak evidence. . . . There are many other critiques I have of the experiment, including the lack of context for how election results are framed in the real world (i.e. people see poll results with headlines attached and not just numbers.) But that’s the main flaw and it’s a fatal flaw.

And Oleg Urminsky wrote:

I disagree with the main implication, based on my concurrent research with Lucy Shen . . . First, BOTH probability & margin forecasts are misunderstood. People UNDERestimate closeness of the outcome w/ probability forecasts, but OVERestimate w/ margin forecasts. . . . Second, the degree of misestimation is SMALL when the margins are small (i.e., a close election) and people really misestimate only when the margin is large. But when the margin is large, any resulting bias would have to be huge to actually change outcomes. . . . Third, we looked and found NO effect on voting intentions or other reported election behaviors. . . . So, while we think the fact that forecast framing affects people’s judgment is fascinating enough to have researched it for 3.5 years now, I disagree with the “probability forecasts depress turnout” takeaway.

Many other people participated in this thread too; you can follow the links and read the back-and-forth if you’d like.

Where do we start?

Before going in and looking at the evidence, I come into this question with a mix of contradictory prejudices.

To start with, I admire Nate’s political analysis and his willingness to accept uncertainty (most famously in the lead-up to the 2016 election). At the same time, I’m annoyed with his recent habit of drive-by criticism, where he engages in some creative trash-talking and then, when people ask for details, he disappears from the scene. Yeah, I know, he’s a journalist, and journalists are always on to the next story, they’re all about the future, not the past. But as an academic, I find his short attention span irritating. If he’s not going to engage, why do the trash-talking in the first place?

I also have mixed feelings about the sort of research being discussed. On one hand, I’m on friendly terms with Messing and Lelkes, and I think they’re serious researchers and they know what they’re doing. On the other hand, I’m generally suspicious of claims about irrational voters. On the third hand, I am on record as saying that people should be more likely to vote if an election is anticipated to be close (see section 3.4 of this article).

Getting to more of the specifics, here’s what Julia Azari and I wrote, following the 2016 election:

We continue to think that polling uncertainty could best be expressed not by speculative win probabilities but rather by using the traditional estimate and margin of error. Much confusion could’ve been avoided during the campaign had Clinton’s share in the polls simply been reported as 52 percent of the two-party vote, plus or minus 2 percentage points. That said, when the general presidential election is close, the national horse race becomes less relevant, and we need to focus more on the contests within swing states, which can be assessed using some combination of state polls and state-level results from national polls. An additional problem is the difficulty that people have in understanding probabilistic forecasts: if a prediction that Clinton has a 70% chance of winning is going to be misunderstood anyway, why not just call it 98% and get more attention?

So on the substance I’m in agreement with Westwood et al. that probabilistic forecasts are a disaster.

Consider that hypothetical forecast of 52% +/- 2%, which is the way they were reporting the polls back when I was young. This would’ve been reported as 52% with a margin of error of 4 percentage points (the margin of error is 2 standard errors), thus a “statistical dead heat” or something like that. But convert this to a normal distribution, you’ll get an 84% probability of a (popular vote) win.

You see the issue? It’s simple mathematics. A forecast that’s 1 standard error away from a tie, thus not “statistically distinguishable” under usual rules, corresponds to a very high 84% probability. I think the problem is not merely one of perception; it’s more fundamental than that. Even someone with a perfect understanding of probability has to wrestle with this uncertainty.
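That conversion is easy to check. Here is a minimal sketch, assuming the forecast is normally distributed with mean 52 and standard error 2; the function name is only for illustration:

```python
import math

def win_probability(mean_share, std_error, threshold=50.0):
    """P(vote share > threshold) under a normal forecast distribution."""
    z = (mean_share - threshold) / std_error
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# 52% with a standard error of 2 points is 1 standard error from a tie.
print(round(win_probability(52.0, 2.0), 2))  # 0.84
```

The same function gives exactly 50% when the estimate sits on the tie line, which is a quick sanity check on the setup.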

Where do we stand?

OK, to assess the evidence I have to read the two above-linked articles: the one by Westwood, Messing, and Lelkes, and the one by Urminsky and Shen.

So now to it.

Westwood, Messing, and Lelkes start by mentioning the rational-choice model of voting: “if P is the (perceived) probability of casting the decisive vote, B is the expected benefit of winning, D is the utility of voting or sense of ‘civic duty,’ and C is the cost of voting, then one should vote if P × B + D > C.”

One thing they don’t mention, though, is the very strong argument (in my opinion) that the “benefit term,” B, can be very large in a national election. As Edlin, Kaplan, and I discussed, voting is instrumentally rational to the extent that you are voting for a social benefit, in which case B will be proportional to the number of people affected by the vote, which is roughly proportional to the number of voters in the election. The probability P is roughly inversely proportional to the number of voters in the election (more evidence on this point here), and when you multiply P × B, the factors of N cancel, hence the anticipated closeness of the election is relevant, even for a large election.
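To see the cancellation, here is a rough sketch. The constants are made up for illustration: `closeness` stands in for whatever multiplies 1/N in the probability of a decisive vote, and `benefit_per_person` is a hypothetical benefit to each person affected.

```python
import math

def expected_social_benefit(n_voters, closeness, benefit_per_person):
    """P x B when P scales like 1/N and B scales like N."""
    p_decisive = closeness / n_voters              # P roughly proportional to 1/N
    total_benefit = benefit_per_person * n_voters  # B roughly proportional to N
    return p_decisive * total_benefit

# Hypothetical inputs: same closeness and per-person benefit, very different N.
small = expected_social_benefit(1e4, closeness=10.0, benefit_per_person=50.0)
large = expected_social_benefit(1e8, closeness=10.0, benefit_per_person=50.0)
print(math.isclose(small, large))  # True
```

Under these assumptions the size of the electorate drops out, and what is left depends on `closeness`: a race expected to be tighter raises P × B whether there are ten thousand voters or a hundred million.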

We also discuss how anticipated closeness can affect turnout indirectly. If an election is anticipated to be close, we can expect more people to be talking about it, so voting will be more appealing as a way of participating in this talked-about communal event. In addition, if an election is anticipated to be close, we can expect more intensive campaigning, so as a voter you will get more encouragement to vote.

So, lots of reasons to expect higher turnout in a close election. It does seem plausible that a forecast such as “Clinton is at 52% in the polls, with a margin of error of 4 percentage points” gives more of a sense of uncertainty than a forecast such as “Clinton has an 84% chance of winning.” Both these are oversimplifications because they ignore the electoral college, but they give the basic idea.

In their paper, Westwood et al. give evidence that many people underestimate the closeness of an election when they are given a probabilistic forecast. I can’t quite see why Nate says that their experiments reveal that people are “underconfident, not overconfident, when they see probabilistic forecasts.” There’s a lot in that paper, so maybe that’s somewhere, but I didn’t see it. Too bad Nate didn’t point to any specifics. Then again, he never pointed to any specifics when he was criticizing MRP, either.

Now to the paper by Urminsky and Shen, “High Chances and Close Margins: How Equivalent Forecasts Yield Different Beliefs,” which begins:

Statistical forecasts are increasingly prevalent. How do forecasts affect people’s beliefs about corresponding future events? This research proposes that the format in which the forecast is communicated biases its interpretation. We contrast two common forecast formats: chance (the forecasted probability that an outcome will occur; e.g., the likelihood that a political candidate or a sports team will win) versus margin (the forecasted amount by which an outcome will occur; e.g., by how many points the favored political candidate or sports team will win). Across six studies (total N = 2,995; plus 12 replication and generalization studies with an additional total N = 3,459), we document a robust chance-margin discrepancy: chance forecasts lead to more extreme beliefs about outcome occurrences than do margin forecasts. This discrepancy persists over time in the interpretation of publicly available forecasts about real-world events (e.g., the 2016 U.S. presidential election), replicates even when the forecasts are strictly statistically equivalent, and has downstream consequences for attitudes toward election candidates and sports betting decisions. The findings in this research have important societal implications for how forecasts are communicated and for how people use forecast information to make decisions.

Ummm . . . wait a second! Urminsky and Shen are not in contradiction with Westwood et al.! Both papers say that if you give people probabilities, they’ll have more extreme beliefs.

After reading Urminsky’s tweets (quoted above in this post), I was all ready to read the two papers and figure out why they come to opposite conclusions. But now I’m off the hook: the papers are in agreement.

As the button says, That was easy.

One thing I will say, comparing the papers, is that Westwood et al. have a bunch of excellent graphs. Urminsky and Shen have some good graphs—I like the scatterplots! Graphs are great. Always do more graphs.

So now that I’ve looked at the two papers, let me return to Urminsky’s remark that “the degree of misestimation is SMALL when the margins are small (i.e., a close election) and people really misestimate only when the margin is large. . . . we looked and found NO effect on voting intentions or other reported election behaviors. . . . I [Urminsky] disagree with the “probability forecasts depress turnout” takeaway.”

He’s making two points here: (a) misperceptions deriving from probabilistic forecasts are small, (b) there were no effects on turnout. I’ll now reread his paper with these two issues in mind.

Let’s start with Study 1 of the Urminsky and Shen paper, which they conducted during the 2016 election. It was a study of 225 people on Mechanical Turk. First, “Participants in the chance-forecast-displayed condition had more extreme reactions to the state of the election conveyed by the forecast than participants in the margin-forecast-displayed condition . . . these results suggest that chance forecasts are seen as conveying a stronger lead than margin forecasts convey. . . . We replicated this finding in an additional study (Study A1 in Appendix 1) . . .” So far, no evidence that the misperceptions were small.

On to Study 2, this time based on 1163 Mechanical Turk participants. Here’s what they found: “participants in the chance-forecast-displayed condition . . . overestimated the margin forecast (60.5% estimated vote share for Clinton vs. 52.6% actual forecasted vote share . . .). Participants in the margin-forecast-displayed condition . . . underestimated the chance forecast . . .” Thinking Clinton was going to get 60% of the vote: that seems like a large misestimation, so I’m not sure why Urminsky labeled it as “SMALL” in his tweet.

What about the results on voting intentions? Studies 3 and 4 of the paper are about sports betting, Study 5 is about movie ratings, and Study 6 is about drawing balls from an urn. That’s it. But I kept reading, and in the discussion section I found this:

When only one forecast format is widely available, our findings suggest that a systematic bias may result. Could this bias affect election results? As we demonstrated that the forecast format can affect attitudes, it could plausibly affect intention to vote, as well. In particular, if chance (vs. margin) forecasts leave readers with a stronger sense that the election has already been decided, showing chance forecasts might demotivate voters (Westwood, Messing & Lelkes, 2018).

However, a presidential election is a high-profile event involving substantial news coverage, personal conversations, and other sources of information and preferences beyond forecasts. As a result, many people are likely to have formed behavioral intentions about whether or not they will vote (as well as about other election-related actions) prior to viewing forecasts, so they should have limited sensitivity to manipulated cues . . .

That makes sense, but it’s just theory, not data. They continue:

In considering the potential impact on elections, it is important to take into account that across our studies, the forecast-format bias was weakest when the margin was narrow (e.g., forecasted election results in Study 2). . . .

But I didn’t think that bias was so small! Thinking Clinton would get 60% of the vote—that’s a big number, it’s a Reagan-in-1984-sized landslide.

But then they come to the data:

More generally, we tested the potential impact of forecast format on intended election behaviors directly in Study 1, in additional election studies in which people were presented with information about changes in chance or margin forecasts over time . . . Forecast format yielded only a non-significant difference in the self-reported likelihood of voting . . . Furthermore, we did not observe a stronger effect of format on behavioral intentions (including voting) among participants living in states where the state-level presidential election was closer, and respondents’ votes therefore were more likely to be pivotal . . .

But, just cos something’s not statistically significant, that don’t mean it’s zero. Also, interactions are notoriously difficult to estimate, so you really really can’t learn anything from the non-statistical-significance of the interaction.

This study was based on only 198 survey respondents. You can learn a lot from 198 people, but not so much in a between-person comparison of a highly variable measure.
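To get a feel for the noise, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that the 198 respondents were split evenly between the two forecast formats and that voting intention is a yes/no measure with a hand-picked baseline rate. The study’s actual measure was a self-reported likelihood, so treat this only as a rough guide.

```python
import math

def se_difference(p, n_per_group):
    """Standard error of a difference between two independent proportions,
    both assumed to have the same underlying rate p."""
    return math.sqrt(2.0 * p * (1.0 - p) / n_per_group)

n_per_group = 198 // 2  # assumed even split across the two forecast formats
for baseline in (0.5, 0.8):  # hypothetical baseline rates of intending to vote
    print(baseline, round(se_difference(baseline, n_per_group), 3))
```

Under these assumptions the standard error of the difference is roughly 6 to 7 percentage points, so a true difference of a few points could easily show up as non-significant.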

To their credit, Urminsky and Shen continue:

These results do not rule out the possibility that our sample size was not large enough to detect a small but real effect of forecast format on voting intention.

The question is, what is “small”? For example, if the people who followed probabilistic forecasts were 5% less likely to vote, and if 2/3 of these people were Democrats, that could make a difference.
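Here is a rough sketch of that arithmetic, under assumptions that are not in either paper: the share of would-be voters who follow probabilistic forecasts (`follower_share`) is a made-up number, and “5% less likely to vote” is read as 5% of followers staying home.

```python
def margin_shift(follower_share, dem_share_of_followers=2/3, turnout_drop=0.05):
    """Net votes lost by Democrats relative to Republicans, as a fraction
    of everyone who would otherwise have voted."""
    lost = follower_share * turnout_drop  # all votes lost among followers
    dem_lost = lost * dem_share_of_followers
    rep_lost = lost * (1.0 - dem_share_of_followers)
    return dem_lost - rep_lost

for share in (0.1, 0.3):  # hypothetical shares of voters following forecasts
    print(share, round(100 * margin_shift(share), 2), "points")
```

With these hypothetical inputs the shift is a few tenths of a percentage point: tiny in most elections, but not nothing in one decided by a fraction of a point.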

Summary

1. The two studies (by Westwood/Messing/Lelkes and Urminsky/Shen) agree that if you give people a probabilistic forecast of the election, they will, on average, forecast a vote margin that is much more extreme than is reasonable.

2. I see no evidence for Nate Silver’s claim that people are “underconfident, not overconfident, when they see probabilistic forecasts.” Nor do I agree with him that “none of the evidence in the [Westwood et al.] paper supports their claims” or that “it shouldn’t have been published.” It makes me lose some respect for Nate that he said this and then didn’t follow up when he was called on it. But it’s not too late for him to either retract this statement or justify it more clearly.

3. Regarding the larger question of whether this sort of thing can swing an election: I don’t know. Like Urminsky, I’m generally skeptical about statistical claims of large effects on behavior from small interventions. First, I’ve just seen too many such claims; second, there really are a lot of things affecting our voting behavior, and it’s hard to see how this one little thing could have such a big effect. On the other hand, in a close election it wouldn’t take much. And I don’t think that Urminsky and Shen’s null finding on voting behavior tells us anything; it’s just too noisy an estimate. So, again, I just don’t know.

4. From a journalistic point of view, the story is just too delicious: the irony that a bunch of political junkies could depress their own turnout by following the news too closely. I’ve argued many times that probabilistic forecasts are overprecise and can lead to loud disputes that are essentially meaningless, and I’ve also discussed the perverse incentives by which probabilistic forecasters have incentives to make their numbers jump around so that they can keep presenting news. So, yeah, if all that artificial news can convince political junkies not to go out and vote . . . that’s the sort of counterintuitive story that can get headlines.

P.S. I agree with this post by Palko that news media horserace coverage is a joke: the partisan media is obviously spewing bias, but much of the nonpartisan media is also printing attention-grabbing crap. So the big picture is that it’s not clear how the voter is supposed to handle this information. Vote margins or win probabilities are the least of the issues.

62 thoughts on “So . . . what about that claim that probabilistic election forecasts depress voter turnout?”

  1. Thanks for the nice writeup of the debate.

    Just to clarify a few things:

    First, we’re not saying there is no effect, just that strong evidence for the claim that biased perception impacts voting is not there. We find no significant effect on voting intentions (or other voting-related behavioral intentions) both in Study 1 (small sample) and in a meta-analysis of all four studies in which we collected voting intentions (N=854, p=.13, Table A1 in Appendix). It’s worth noting that Study 1 of WML also finds no significant effect on voting intentions! So, we see it as an intriguing possibility that is not established by their or our data. Of course, an effect that is not significant with ~1000 people might (or might not) be actually there with millions of actual voters.

    Second, in thinking about the effect on voting intentions, our point is that there is a tradeoff between magnitude of the perceptual effect and impact of voters abstaining, if they are affected. When a race is very close, the perceptual effect is relatively small (probability forecasts don’t make much of a difference to beliefs) and the effect of abstaining is big; when the race is not close, the perceptual effect is large (probability forecasts look very different than margin forecasts) but it would take a lot of abstaining to shift the election. This dampens the impact of the perceptual effect on voting.

    Third, in thinking about whether probability forecasts are bad, it’s important to recognize that margin forecasts are also bad, but in the opposite direction: people seeing margin forecasts underestimate the probability of an outcome. So, if the goal is to foster accurate beliefs (as opposed to inaccurate but vote-motivating beliefs) then margin forecasts alone are also a problem.

    Oleg

    • Thanks for engaging. Just to reiterate, in study 1 of WML 93% of respondents say they would vote. If almost everyone says they’re going to vote, you can’t detect an effect. That’s what we mean when we say “talk is cheap” and voting is not. When we impose a cost of voting as we do in the game, the voting rate goes down and we actually see effects.

    • Actually reading your paper…. I have a *huge problem with it*.

      “We presented participants with either a chance forecast (probability of election outcome) or a presumably equivalent margin forecast (predicted vote share) and asked participants to rate their attitude toward the forecast news.

      Participants (N=225 valid participants from Amazon Mechanical Turk, or AMT, after excluding those with duplicate IP addresses and failed attention checks, which was done in all the studies) saw an election forecast that displayed either the chance of Clinton winning (i.e., the chance-forecast-displayed condition) or Clinton’s predicted margin of victory (i.e., the margin-forecast-displayed condition).”

      The chance forecast and the margin forecast are *not* equivalent. The margin forecast *can be* equivalent under the assumption that the uncertainty in the margin is known, but on its own, knowing that the polling aggregate gives Clinton 51.5% confers *no information about the probabilities*, even if we ignore the electoral college.

      Did you provide the participants with the vote predictive distributions?

        • Yes, we agree and this is discussed and addressed later in the paper. In the first two studies, we use election forecasts that fivethirtyeight.com presents as equivalent, although of course we can’t confirm that. In the last two studies, we generate forecasts that are statistically equivalent.

  2. A forecast should reflect the fact that the forecast itself will be made public. The forecaster should be responsible for their estimate of that impact.

  3. I think the debate as described may be missing something. In my experience, people tend to CONFUSE probabilistic estimates with margin estimates to some extent, probably because of a long history of dealing with margin estimates. So, as Andrew mentioned, people hear 60/40 and think it’s an absolute blowout, because they don’t understand that what they are hearing is a probabilistic estimate, not poll results like they are used to.

    I think it’s interesting that this would depress turnout on the higher-probability side (Clinton in 2016) more than the lower-probability side (Trump in 2016). I didn’t see anything in the excerpts here about why that would be or how the data supports that, and unfortunately I don’t have free access to the paper.

    • “I think it’s interesting that this would depress turnout on the higher-probability side (Clinton in 2016) more than the lower-probability side (Trump in 2016). ”

      Yes, an interesting question. One possibility would be that the “psychological effect” might differ according to the side — for example (as admittedly a wild conjecture), with a 60/40 estimate, those on the estimated “winning side” may react with “an absolute blowout” attitude, and so feel no need to vote if they haven’t already, whereas those on the estimated “losing side” might have a gut reaction like, “We’re the underdog right now, but it’s not a really wide margin, so we need to rally and do all we can do to support our candidate, so we need to get out the vote on our side.”

      • Hmm — my last sentence seemed long, so I did a quick count and came up with more than 100 words. (But at least I didn’t have any nested parentheses (which I have on occasion been known to do).) (Does the sentence immediately preceding this one qualify as self-referential?) ;~)

    • It is a pet peeve of mine when I hear “people tend to CONFUSE probabilistic estimates with margin estimates to some extent.” Ordinary people are not confused. Statisticians have made a mess of language. What is a probability estimate for an event that only happens once? While I understand that Silver and others are building a simulation and running it many times, most people don’t and it is not explained that way. Usually, the estimate is given without sufficient explanation. People naturally think of probability as the odds that an event will occur over some distribution of possibilities. But, when a probability is attached to an event that, as far as they know, is either going to occur or not once, the statement loses meaning not because regular people are confused, but because the statement is confusing. They then translate it into something that makes sense to them. I am not CONFUSED when people speak French to me. I just don’t understand French. If the waiter asks me, “More pain?” I say no even though I like bread.

    • Andrew (not the author), Martha:

      The argument in that paper is that turnout would be depressed among potential voters who follow probabilistic forecasts, and they present evidence that more Democrats than Republicans follow probabilistic forecasts.

  4. Suppose a potential voter actually understands the odds their vote is pivotal, perhaps because they’ve read Andrew’s papers on this topic. Surely those people wouldn’t be prompted into getting their turnout decision wrong (relative to a full understanding of all uncertainties) by a probability forecast of the election. If they stay home after checking the 538 forecast, it will be because P × B + D is now actually less than C for them. (They will also have treated Silver’s forecast as merely one piece of evidence among several when forming belief P.) So the turnout effect of probability forecasts does not harm these people. It benefits them.

    But suppose a potential voter doesn’t understand the odds their vote is pivotal. They think the odds are much better than they actually are. Those people may decide to stay home when 538 tells them Trump wins with probability 85% but would have voted had they heard Trump 52% +/- 2, because they translate these into a 1/10000 and a 1/1000 chance that their vote matters. If so, you could say their misinterpretation of the probability forecast moved their turnout decision towards what it would be if only they understood the uncertainties, and surely that’s a good thing.

    If the effect on turnout is real, is anyone made worse off because they are encouraged to make a decision they would make if only they were fully informed?

  5. Andrew writes: “One thing they don’t mention, though, is the very strong argument (in my opinion) that the “benefit term,” B, can be very large in a national election. As Edlin, Kaplan, and I discussed, voting is instrumentally rational to the extent that you are voting for a social benefit, in which case B will be proportional to the number of people affected by the vote, which is roughly proportional to the number of voters in the election.”

    I think that you can go further. If voting is instrumental only in terms of individual benefit, the vast majority of rational people simply wouldn’t do it. Turn-out would be incredibly low. So, if voters are rational at all they are maximizing a social benefit. I agree with you that stories about the “irrational voter” or irrational agent in general are suspect. I can only impute the agent’s preferences and decision process from his observable behavior. To posit that an agent is rational just means that his or her actions are consistent in some manner. But, there are many different ways to model the behavior as being rational. You can change the payoffs or, like you have done, change what the agent is maximizing. To say that an agent is irrational is just to say that he doesn’t act consistently. But then the agent becomes completely unpredictable. To say that an agent makes errors only makes sense if the agent is rational in some sense such that their actions can be described as consistent most of the time under some theory of rationality and then occasionally deviates from that rational decision making process. If the agent is always irrational, his behavior follows no consistent decision making process and it makes no sense to say the agent is committing an error.

    • > If the agent is always irrational, his behavior follows no consistent decision making process and it makes no sense to say the agent is committing an error.

      This is saying “only random is irrational”.

      By this logic, the person who consistently refuses to leave their house because they believe space aliens will beam eggs into their chest cavities to burst forth later like in the movie Alien, but that the walls of their house block the beams… is rational.

      I think we need to go further than “doesn’t act like a random number generator” to qualify as rational. It must also be the case that their views are based on some kind of realistic assessment of reality.

      • My point is typically when we say that an agent is acting rationally we just mean if they prefer a to b and b to c, then they prefer a to c. We can’t directly view an agent’s preferences. We can only impute them from their actions. To impute their preferences we have to assume that their preference function is transitive. If I don’t make that assumption, then I cannot impute any preferences to the agent because their preference function will be inconsistent and everything follows from an inconsistency. If I say that it is possible that you prefer apples to oranges and oranges to lemons and lemons to apples, then I can’t infer from your actions what you are going to do. Irrational does not mean random. It does imply inconsistency, which makes it impossible to impute preferences.

        Your example of the aliens is wrong. If someone really believes that aliens are going to get them if they leave the house and they don’t want to be taken by aliens, then it is perfectly rational to stay inside. But, we are just talking about instrumental rationality. I agree that a broader sense of rationality exists and is important. However, having false beliefs in and of itself is not sufficient for any type of irrationality.

        • Why should human preferences be transitive? Of course, if they can be reduced to numbers, they are. But there are many things that are only partially ordered, and which, therefore, cannot be reduced to numbers. Keynes thought that probabilities are only partially ordered. We know that human preferences are not always transitive. We cannot conclude from that that they are irrational. For a good treatment of partially ordered values, see combinatorial game theory.

        • Essentially no-one but an economist would say that the person who believes in alien space rays was rational.

          The problem with the alien space ray person is that they don’t accept the meaning of everyday evidence, i.e., there are thousands of people that they can see who *don’t* have aliens jumping out of their chests even though they have been exposed to the outdoors for decades.

          The person makes a rational type decision based on their beliefs, I agree with that, but their beliefs are irrational because they are not based on reality.

          I would say a person is rational just to the extent that they have good reality based reasons for their beliefs, they exhibit stable and consistent preferences, and they take actions which could reasonably be expected to lead to improving the achievement of their preferences considering their limited knowledge and skill. It’s gotta be the whole package.

        • I agree. There are many senses of being rational. I will try to make my point differently. If you say an agent is irrational, you can only really mean that they are being irrational sometimes. Because for us to observe that the agent is making a wrong decision, we must impute to the agent that it wants something or has some end that it is trying to advance in a consistent way. If I don’t posit that its actions are consistent with it pursuing some goal (whatever that goal be), then there is no sense in saying the agent is being irrational. Agents that are wholly irrational are not making errors. They don’t have any consistent goal they are pursuing. So, perhaps I am attacking a strawman, but when I hear discussions about the irrational voter or the irrational consumer, I often think the whole discussion is confused because evidence that humans make errors can be interpreted as evidence against one particular model of rationality or it can be interpreted as evidence that rational agents make mistakes under certain circumstances. But, it can never be interpreted as evidence that the agents are wholly irrational.

        • “Because for us to observe that the agent is making a wrong decision, we must impute to the agent that it wants something or has some end that it is trying to advance in a consistent way.”

          It is so. If we don’t know the agent’s goals – and know their knowledge – we can’t assess the rationality of their actions.

          But I insist that knowledge is the key factor. “Rational” behavior is using knowledge and logic to achieve some goal. Consistency is a function of what the agent thinks they know at the time of any given action.

  6. Andrew, one problem here is that Nate’s probabilistic forecast cannot be mapped to 52% +/- 2%. One of the reasons why Nate did better than others in 2016 was the incorporation of correlated errors in state polling, taking in the effect of the electoral college.

    I suppose one might do a 549 seats +/- 100 or whatever, but the posterior distribution of seats is frequently non-unimodal.

    Tangentially, I recall Nate making a statement “these outcomes are equally probable – a Trump win, A narrow Clinton win, a Clinton landslide”, and it being received as “Nate is saying he has no idea what will happen”…

    • Hey Zhou,

      If you go to https://projects.fivethirtyeight.com/2016-election-forecast/#odds and click the “Electoral Votes” tab, you will see 538’s electoral vote share projection accompanied by an 80% CI.

      We explain how the vast majority of probabilistic forecasts can be mapped to a “share of ” [something] projection in the paper, whether you rely on the normal distribution—as Andrew did—or a simulation approach—as 538 is thought to.

      However, that said, mapping 52 +/- 2 to an 84% probability may be a touch unfair to forecasters. They will be estimating things based on many surveys and have a *far* better (and less biased) estimate of the MoE than the usual survey (which tends not to account for total survey error). It will look more like what’s in our paper, which is ~ 67% chance.
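
As a rough illustration of the normal-distribution mapping discussed above, the sketch below converts a vote-share forecast into a win probability and back. It assumes a two-candidate race, a normally distributed vote share, and that "+/- 2" is read as one standard deviation. The function names are made up for this example and are not taken from the paper or from 538.

```python
from statistics import NormalDist

def win_probability(mean_share, sd, threshold=50.0):
    """P(vote share > threshold), assuming a normal forecast distribution."""
    return 1.0 - NormalDist(mean_share, sd).cdf(threshold)

def implied_sd(mean_share, win_prob, threshold=50.0):
    """Standard deviation that makes win_prob consistent with mean_share."""
    z = NormalDist().inv_cdf(win_prob)  # standard normal quantile
    return (mean_share - threshold) / z

# 52 +/- 2, with 2 treated as one standard deviation (an assumption)
p = win_probability(52.0, 2.0)
# How wide the error would need to be for 52% to map to a ~67% chance
sd67 = implied_sd(52.0, 0.67)

print(round(p, 3))     # about 0.841
print(round(sd67, 2))  # somewhere around 4.5 points
```

Under these assumptions, the same 52% point estimate moves from roughly 84% to roughly 67% just by widening the assumed error, which is the sensitivity the comment is pointing at.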

    • Zhou:

      Nate definitely has received some stupid criticism by people who don’t know what they’re talking about! I don’t envy him that, and I guess he can have difficulty filtering the sensible criticism from the vast mass of crap that he has to deal with.

    • Zhou

      It’s possible to say: between 280 to 400 seats, with an average of 350 (or whatever), in the non-unimodal case. This is really a visualization/presentation problem that there are solutions for.

  7. A couple of points.

    1) On the irrationality of the average adult with regard to probabilities. There was research many years ago — things may have changed, I haven’t kept up — that indicated that one reason people made mistakes with probabilities was that they made mistakes with fractions and percentages. They are much better with integers.

    So what if election forecasts were made in terms of integers? Instead of saying that Bert has an X percent chance of winning, or that Bert leads Ernie by Y percent, say that we project that Bert will win by Z votes. Note that this is done in retrospect in close elections. E.g., Bert carried Paducah by 1713 votes. Suppose that before the polls closed the forecasters said that they projected that Bert was leading by 5,000 votes, ± 2,000. An Ernie supporter might think that if she and a few thousand other Ernie supporters turned out, they might be able to swing the election to Ernie. OC, given a percentage lead she could draw the same conclusion, but not if she has trouble with percentages. And given only a projected probability that Bert would win, what conclusion could she draw?

    2) A philosophical point about game theory, rationality, and the value of one’s vote. In any large election, the probability that I will cast the deciding vote is effectively infinitesimal. On that basis I decide that it is not worth my while to vote. But what if everybody else thought as I did? Then they would not vote, and I could cast the deciding vote. In that case the value of my vote would be huge. But what if everybody else engaged in the same reasoning? Then they would vote, and my vote would have little to no value. This Kantian line of reasoning has no end. I conclude that the probability that I will cast the deciding vote is not, therefore, of any use in making a rational decision about whether I should vote, given only a large electorate. The very fact of the election means that somebody’s vote has value, and then so does everybody’s vote. What matters to my decision is the value of the outcome to me. Whether I cast the deciding vote is not a factor.

    However, suppose that I know how everybody else has voted, or will vote. Then, in all likelihood the value of my vote is 0. That means, then, that if I have information about how others will vote, it may well affect the value of my vote. Polls and projections do give me such information. And that information will often reduce the value of my vote relative to what it would be without that information.

    • I don’t think your game theory analysis works. All that has to happen to stop the backward induction is for one voter to believe a significant number of people will vote either because of some large benefit to them or because they are irrational. So, I doubt that I will see any benefit from the next election or only a very small benefit, but I know that some small group of insiders (people who will get appointments from the next president or who have business deals that depend on specific government approvals) have a big interest. So, at the first stage, I think my voting power and everyone else’s voting power is so low that no one should vote, but then I think if everyone knows that and therefore doesn’t vote, my voting power shoots up, but so does everyone else’s, but I still get very little personally from voting while there are a bunch of people who get a lot personally. So, in this scenario, I know my voting power will be once again reduced and my payoff worse than others’, so I should sit home. Likewise, if I believe that a significant number of people can be fooled into voting, either by being fooled about the size of the likely payout or by confusion over their voting power, then a large group will vote and I should sit home. In either case, if voters act instrumentally just to benefit themselves, voter turnout should be extremely low. So, I think Andrew’s thesis that people are voting for a social benefit not an individual benefit is solid.

    • Bill:

      See this paper for some theory about the rationality of voting. Short answer is that I think the statement, “In any large election, the probability that I will cast the deciding vote is effectively infinitesimal,” is not quite right. The probability can be of order 1 in a million, which is small, but distinct from zero when it is multiplied by the very large number that represents the stakes in the election. Also in that paper we work out the equilibrium from the game theory analysis (see Figure 1).
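
To make the "small probability times large stakes" arithmetic concrete, here is a minimal sketch. The 1-in-a-million probability comes from the comment above. The dollar value per person and the population size are purely hypothetical numbers chosen for illustration, not figures from the paper.

```python
def expected_benefit(p_decisive, stakes):
    """Expected value of a vote: chance of being decisive times what is at stake."""
    return p_decisive * stakes

p_decisive = 1e-6          # "order 1 in a million", as in the comment
per_person = 100.0         # hypothetical: election outcome worth $100 per person
population = 300_000_000   # hypothetical: number of people affected

# A voter who counts only their own payoff
selfish = expected_benefit(p_decisive, per_person)
# A voter who counts the benefit to everyone affected
social = expected_benefit(p_decisive, per_person * population)

print(selfish)        # a tiny fraction of a cent
print(round(social))  # on the order of tens of thousands of dollars
```

With these made-up inputs the purely individual expected benefit is negligible, while the social expected benefit is large, because the stakes scale with the population roughly as fast as the probability of being decisive shrinks.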

      • Yes, odds any individual influences an election are ~zero. But the psychology lit shows people don’t think that way. They reason subjectively and imprecisely, especially about one-off, low probability events—odds of winning the lottery are similarly small & people still play

        • @Solomon

          Oh, I think that everyday people are well aware that their vote will not itself decide the election. But I don’t think that they therefore reason subjectively and imprecisely about voting. (OC, people do reason subjectively and imprecisely.) As I indicated, they follow the Kantian precept of universalization, whether they have heard of Kant or not. They are well aware of the question, having heard it from childhood, “What if everybody did that?” They form an implicit alliance with other voters who share the same views on the outcome of the election, and they also universalize to other voters, as well. To win the election (the game), their side has to vote, and on the assumption that the other side or sides will vote, enough people on their side have to vote. It is not just that, if they are civic minded, they view the payoffs of the election broadly, they do not regard the decision whether to vote or not in purely individualistic terms, at least, not if they care about the results of the election. (OC, I am not talking about everybody.)

          Suppose that you have an election with equal numbers on each side. The voters on one side think individualistically, objectively, and precisely, and a fraction of them do not vote, while the voters on the other side think in terms of alliances, subjectively, and imprecisely, and they all vote. Their side wins the election. The other side can console themselves with having been rational.

    • Zad:

      No, I emailed him but he did not reply. But he’s so busy, I have no idea if the email even reached him. I really can’t see why he does these attacks if he’s not interested in following through. Perhaps it’s as I wrote in my above post, that he writes these tweets just to blow off steam. Maybe he’s surprised or amused that anyone takes these tweets seriously!

        • Zad:

          I followed that link and saw this from Nate:

          First, what is a demographic regression? To infer a candidate’s standing in all 57 states and territories at any given time, our model calculates a series of demographic regressions based on (i) the results of states that have voted so far and (ii) the polls in states where we have abundant polling. For instance, the regressions can figure out that Biden is strong in states with large African American populations, and that Sanders is strong in liberal states. These demographic regressions are then combined with a geographic prior based on candidates’ home states and regions (for example, Sanders is assumed to be strong in New England). The result is then used as a substitute for polling in states where there are no polls and as a complement to the polls in states where there isn’t much polling. Nevada and South Carolina, for instance, have a fair amount of polling but not as much as the model would like, so the regression gets a small amount of weight in our forecasts.

          “A demographic regression . . . combined with a geographic prior . . . is then used as a substitute for polling in states where there are no polls and as a complement to the polls in states where there isn’t much polling” . . . that sounds exactly like MRP!

          So now I’m really confused.

          On one hand, I’m glad that Nate is using MRP or something similar. On the other hand, now I can’t see why he’s so negative about it. Is it just the brand name? If, instead of calling it “multilevel regression and poststratification,” he were to call it “demographic regression, geographic prior, used as a substitute or complement” (DRGPSC), would that be ok? I’m genuinely baffled. Maybe someone can explain?

        • Andrew, I guess he just might not care about being careful with his statements/criticisms. The only time I saw him being really serious about someone’s criticisms was when he was feuding with Nassim Taleb on Twitter.

        • Zad:

          It’s my impression that Nate takes seriously what he writes on his website, but he doesn’t take seriously what he writes on Twitter. In contrast, I feel like Taleb and I take everything we write seriously. That doesn’t mean that Taleb and I are always right; it’s just that we’re not in the habit of saying things we won’t stand behind.

      • I think it’s irresponsible for Silver to let off steam on this topic in particular without anonymizing himself. He’s very publicly considered an authority on election forecasting, maybe THE authority on election forecasting. Whether he’s taking himself seriously or not, other people are going to take him very seriously and, knowing that, he’s obliged to be careful about what he says.

        Hayek on the Nobel Prize:

        It is that the Nobel Prize confers on an individual an authority which in economics no man ought to possess. This does not matter in the natural sciences. Here the influence exercised by an individual is chiefly an influence on his fellow experts; and they will soon cut him down to size if he exceeds his competence. But the influence of the economist that mainly matters is an influence over laymen: politicians, journalists, civil servants and the public generally. There is no reason why a man who has made a distinctive contribution to economic science should be omnicompetent on all problems of society – as the press tends to treat him till in the end he may himself be persuaded to believe. One is even made to feel it a public duty to pronounce on problems to which one may not have devoted special attention. I am not sure that it is desirable to strengthen the influence of a few individual economists by such a ceremonial and eye-catching recognition of achievements, perhaps of the distant past. I am therefore almost inclined to suggest that you require from your laureates an oath of humility, a sort of hippocratic oath, never to exceed in public pronouncements the limits of their competence.

  8. Some scattered thoughts about the Westwood et al paper:

    1. They found (page 15) that displaying a margin of error for the margin display had *no impact* on the judgements of the state of the race and/or certainty. This is a deeply alarming finding, because it fundamentally undermines any theory that they are showing *equivalent* results. If we accept this result, purely by adjusting the margin of error (which is then ignored by margin viewing users), we can construct an experiment where theoretically, the margin viewers *vastly* overstate or understate the uncertainty of the result relative to the probabilistic. The direction of the effect is essentially at the contrivance of the experimenter.

    2. Nate’s claim that the study shows that users are underconfident with probabilistic forecasts comes from the exceedingly confusing graph at the bottom of figure 5. Comparing the yellow line with the steep diagonal black actual probability of win line, we observe that the slope is closer to the horizontal – implying that users shown a probability of victory discount the probability of victory towards 0.5 for winning candidates, and overestimate the probability of win for losing candidates. (Albeit, to a lesser extent than they do for margin viewers).

    One might argue that this is rational Bayesian behaviour (what if 538 messes up?), but Nate’s probabilistic model contains components for model failure. Therefore there’s some degree of double counting happening here. In any case, this point seems debatable – people will have a different definition of a ‘close’ election on the basis of margins vs win probabilities.

    3. I wonder if there’s a reconciliation in that if you ask people the vote margins when they were given a probabilistic forecast, people are treating the question as a *conditional prediction*. That’s to say, they believe they are being asked what the vote margin will be, *if* the high-probability candidate wins. In this case, in the situation of an uncertain election, high degrees of uncertainty map correctly to an increased expected vote margin.

    4. Their second study contrives a situation where the rational decision is to not vote. In this experiment there is a substantial cost to voting, and a fairly low benefit if their team ‘won’ (4x the cost of voting). Extrapolation from this to the general election is a very dangerous step – especially if the thesis statement of the authors is that probabilistic results ‘confuse and depress turnout’, implying the rational decision is to vote. A different interpretation of their findings would be that probabilities ‘partially clarified’ the true decision space the participants were in and the participants made fewer errors (but still some).
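
The claim in point 4 that not voting is the rational choice can be sketched with a simple expected-payoff rule. This is a simplified model, not the actual design of the study: it assumes a risk-neutral participant who cares only about their own payoff, and the names `should_vote` and `p_pivotal` are invented for this example. Only the 4x ratio of benefit to cost comes from the comment.

```python
def should_vote(p_pivotal, cost, benefit):
    """Vote only if the expected gain from being pivotal exceeds the cost."""
    return p_pivotal * benefit > cost

cost = 1.0               # arbitrary units
benefit = 4.0 * cost     # benefit is 4x the cost of voting, per the comment
break_even = cost / benefit  # pivotal probability needed to justify voting

print(break_even)                        # 0.25
print(should_vote(0.10, cost, benefit))  # False
print(should_vote(0.50, cost, benefit))  # True
```

Under these assumptions a participant would need at least a 25% chance of swinging the outcome before voting pays off, which is why showing participants a lopsided probability could push them toward the payoff-maximizing choice of abstaining.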

    • On point three, a chunk of evidence for the conditional prediction interpretation is that when provided with a win probability of 50%, participants estimated vote share well in excess of 50%. Indeed at a win probability of merely 30% for candidate A, participants were predicting an average of 50-50.

      In the survey they asked the question about vote share after they asked the question about win certainty. They should have randomised the question order.

    • Hey Zhou:

      1. Before I respond directly, your comment raises a critically important point—that probabilistic forecasts are **incredibly** sensitive to assumptions you make about the error. This may be why so many forecasters (emphatically not 538) in 2016 had estimates in the high 80s and even 90s—they failed to account for additional sources of error, including correlated errors and sampling bias. Going back to your point though, it’s not necessary for results to be equivalent for the paper’s implications to hold. And we do use assumptions about the noise that make our estimates on both vote share and probability of winning very close to what you see in the real world.

      2. I have a **very** different read of Figure 5. Many participants—38% in fact—simply confused vote share and probability, putting down the exact same number for both. So when we ask what’s the probability of winning, many respondents are simply putting down the vote share. Also probably in play for other respondents is the fact that forecasters just had what the public perceived to be a spectacular failure in 2016 (as you’re getting at), and so may indeed discount the forecast. But regardless, our claims are not about “under” or “over” confidence, just increased confidence relative to vote share projections.

      3. Not sure I understand what you’re trying to say here.

      4. Yes, we need a field experiment. But the notion that people behave according to rat choice here is incredibly flawed, the psych lit clearly shows people have problems reasoning about very low probability events—it’s why they play the lottery. This by the way comes out very clearly in the data for study 2. People vote slightly less frequently when a blowout is forecasted, but the effect is definitely not in line with rat choice theory. Note that in a follow-on study, we explicitly manipulated the size of the group participants were playing, which had no effect. Worth noting that the key finding replicated.

      • Thank you for replying Solomon.

        It’s absolutely correct that probabilistic forecasts are sensitive to such things. However the interpretation of margin of victory forecasts is also sensitive. 52+/-2% is not the same as 52+/-8%! Going back to 538’s popular vote forecasts, the numbers you used in your study 1 *do not* match the sorts of posterior error distribution 538 used – using a normal approximation implies that the true 95% CI for the 538 vote nowcast is closer to +/- 6%.

        “But regardless, our claims are not about “under” or “over” confidence, just increased confidence relative to vote share projections.”

        The article isn’t nearly so neutral though. If it’s only the difference that matters, one might equivalently rewrite the article as “vote margin forecasts are generally misinterpreted or ignored by consumers”. Looking at those flat lines, that’s an easy conclusion to make.

        “But the notion that people behave according to rat choice here is incredibly flawed, the psych lit clearly shows people have problems reasoning about very low probability events—it’s why they play the lottery.”

        The psych literature also shows that people respond to priming. Surfacing the low probability of their votes being decisive, and hence the payoffs, in a game where this is the case *should* influence their decisions. It’s why lotteries avoid surfacing the true win probabilities.

        Essentially my conjecture would be something like this:

        1. Users broadly do not understand vote share models. Thus they have no real impact on their decisions, people behave according to their prior beliefs.
        2. Probabilistic predictions however are relatively persuasive. People can fail to map this to a vote share prediction, but only *because they do not understand vote share models*. Provided with probabilistic ideas people behave more rationally, and in some cases this can lead them to vote less, but in others it can lead them to be more politically active (for example, in terms of driving greater political engagement – I’m willing to bet 538 consumers are much more politically engaged than average and that’s at least partly causal).
        3. The effect of polling model reporting is relative to the prior beliefs of the holder. If the holder is starting from a 50% win chance, then a win probability = 70% poll will update that more than a vote share. But if the holder is starting from a 90% win chance assumption, then a win probability = 70% result is shocking in a way the equivalent vote share result is not. If this were the case, then a win probability prediction would actually *increase* turnout, by suggesting the race is narrowing.

        • For instance, if we assume (and admittedly this is quite a strong assumption) that the average American had similar win probability estimates to those of PredictIt betting markets, there’s an implied 80% chance of Clinton winning in 2016. For Clinton supporters their confidence would be even higher. Present them 538 giving a win probability of 71% and wouldn’t their belief in Clinton winning actually fall? One can easily see a scenario where a 52% vote share result is ignored whereas a 71% win prob actually spurs people to go out and vote.

  9. “So the big picture is that it’s not clear how the voter is supposed to handle this information. Vote margins or win probabilities are the least of the issues.”

    Absolutely!

    So this is a cool and interesting topic and worth more research but the phrase “we show” is seriously overworked in the quoted passage from the paper. Whatever we might think about these effects, with so many intervening issues it seems that to “show” this effect is real should require a far stronger base of evidence.

  10. This feels relevant again. The one thing I’m curious about is if you think the ways in which these types of forecasts and aggregates are covered in the press make for a “small nudge” or if they’re a big one? This particular election had a lot of odd structural nudges and a lot of accommodations, so maybe it’s impossible to tell, but it’d be fascinating to hear what you think.

    • Dan:

      I don’t think that election forecasts depressed turnout this year. For one thing, in surveys about half the respondents thought Trump would win, so it’s not like there was a general perception among voters that the election was in the bag.

      The big turnout story of the 2020 election was the high turnout among Republicans, not just Cuban-Americans in Florida but Republican voters all over the country. I’m just guessing here, but right now I’m inclined to attribute this higher Republican turnout to on-the-ground voter registration and mobilization by Republicans (not matched by Democrats because of the coronavirus) and maybe Republicans being more motivated to go vote on election day in response to reports of 100 million early votes.
