Election forecasting updating error: We ignored correlations in some of our data, thus producing illusory precision in our inferences

The election outcome is a surprise in that it contradicts two pieces of information: Pre-election polls and early-voting tallies. We knew that each of these indicators could be flawed (polls because of differential nonresponse; early-voting tallies because of extrapolation errors), but when the two pieces of evidence came to the same conclusion, they gave us a false feeling of near-certainty.

In retrospect, a key mistake in the forecast updating that Kremp and I did was that we ignored the correlation in the partial information from the early-voting tallies. Our model had correlations between state-level forecasting errors (though maybe the correlations we used were still too low, hence giving us illusory precision in our national estimates), but we did not include any correlations at all in the errors from the early-voting estimates. That's why our probability forecasts were, wrongly, so close to 100% (as here).
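
To see the mechanism in a toy example (this is only a sketch of the arithmetic, not our actual model, and all the numbers below are invented): suppose you update on n early-vote tallies, each an unbiased estimate of the national margin with the same error sd. If you treat the errors as independent, the uncertainty of their average shrinks like 1/sqrt(n) and the win probability races toward 100%; if the errors share a common component, the uncertainty barely shrinks at all.

n     <- 10    # hypothetical number of early-vote tallies
lead  <- 2     # hypothetical observed lead, in percentage points
sigma <- 3     # hypothetical error sd of each tally

# sd of the average of n equally noisy estimates with pairwise error correlation rho
sd_of_mean <- function(rho) sigma * sqrt((1 + (n - 1) * rho) / n)
p_win      <- function(rho) pnorm(lead / sd_of_mean(rho))

round(p_win(0.0), 2)   # errors treated as independent: about 0.98, near-certainty
round(p_win(0.7), 2)   # strongly correlated errors: about 0.78, much more uncertain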

42 thoughts on "Election forecasting updating error: We ignored correlations in some of our data, thus producing illusory precision in our inferences"

  1. I think an emerging trend is that the activity of polling itself is increasingly viewed through a partisan lens, with, in particular, Trump supporters being skeptical of the entire enterprise. My hunch is that Trump supporters are less likely to answer pollsters honestly, and less likely to respond at all. This leads to data that are missing not at random, which is obviously difficult to model.

  2. My confidence in pundits, pollsters, and political statisticians is really quite low right now. You often, quite rightly, point out the flaws and self-delusions of a lot of psychology research. But is political analysis any better? Sure, the nature of the mistakes is different — more subtle problems, like, let’s say, correlations in errors, rather than things like flagrant forking-paths issues. But in the end, all that matters is the ability to make testable predictions, and the brutal assessment that if meaningful predictions can’t be made, the whole exercise is pointless. At least for the power-pose research and others like it, the end result doesn’t really matter; we can laugh off the papers like we dismiss astrological forecasts. For things like presidential elections, though, lots of us really do care, and the standards of criticism should be much, much higher.

    • +1

      My problem is that too many people dwell too much on the elegance of their methods or their theoretical superiority, etc., rather than focusing on predictive accuracy.

      I've even heard models criticized because "they don't use all the data" and in the modeler's mind that data is "important." But ultimately, who cares? If you can't get the predictions right it's all futile.

      And here we have sort of the highest stakes predictive opportunity in the world. The best brains, lots of money, tons of data.

      And yet, is this the best we can do?

      • As far as I can tell, 538 did alright. I’d like to see a state-by-state evaluation of their prediction intervals though.

        But still–it goes to show that even mountains of data can’t give us certainty. What’s important is to get the uncertainty estimates right.

    • >”My confidence in pundits, pollsters, and political statisticians is really quite low right now.”

      More and more people are realizing how easy it is for a bunch of well funded and highly trained people to widely and confidently publicize claims that are very, very wrong.

      Also, pretty much all the problems facing psych also face medical research, which we should all care about. I am more scared of the harm misinformed doctors can cause than of most illnesses. That is a shameful situation given the amount of money that has been poured into the medical system. (Granted, I am ahead of the curve on that one.)

    • Why does it matter that we get the *prediction* for something like an election right? We're going to find out eventually if it's heads or tails, as we did today. What would have changed if the predictions had come out right? (Although, as someone pointed out, everyone made a probabilistic statement, so every outcome was a possible event and nobody was really wrong.) I'm sure it was important for people betting money on this or that outcome, but in what other sense does it matter that someone's binary prediction about who will win or lose tomorrow came out right?

      And we already knew that statistical models suck at predicting anything, especially the future.

      • There's a long answer I could give about why the prediction matters, which perhaps I'll get to later. For now I'll just flip this around: if the prediction doesn't matter, if it's pointless to care about prediction because "we're going to find out [the outcome] eventually," then why do grown men and women put such effort into making these predictions? Some even do it for a living. Why? Do they not have meaningful things they could be doing? Is the whole exercise just a game? I can't believe that Andrew and others, who seem thoughtful and conscientious, really think so.

        I’ll try to write more later, but I’ve got work to do — work that, I hope, has an actual point.

        • Why do they do it? Because we are prediction machines. We want to predict everything, whether it makes sense to or not. In climate research, medicine, and many other areas it makes sense to predict, and it matters that we get it right. In this specific scenario of predicting who will be president tomorrow, it's not useful the way it is in medicine etc.

      • Folks could have checked out this website prior to it crashing ;-)

        Ruth Graham 11:14 PM
        With a Trump win looking increasingly possible, Canada's Citizenship and Immigration website has crashed. Despondent Americans looking for information on fleeing to Justin Trudeau's frozen paradise are currently being greeted with an "internal server error" message.

        Having a good sense of who will win an election prior to the polls closing could be important to many.

      • @Shravan

        I'm kinda with you that predicting elections is stupid. But my point is that now that we have put so much time and effort into it, getting it right matters! Matters a lot.

        I’d have been with you had you exhorted people a priori to not waste so much time predicting election outcomes but instead to devote it to (say) predicting who will get a heart attack or something.

        But now that they have all done such a crappy job predicting, after putting in so much effort and the best of methods, we cannot just brush it away by saying "it doesn't matter."

        • I agree with you on all these points. It’s worth trying to get it right for its own sake. But it has no practical utility the way prediction in medicine, climate research, and a lot of other areas has.

          Predicting elections on such a short time window probably has only one practical implication, i.e., making money or profiting in some other way from buying the right stocks and/or betting on the election itself, and/or aligning yourself with the right person.

          I guess another utility is that it’s a good sanity check on how good statistical models are at predicting stuff correctly. It’s an academic research problem, and there’s nothing wrong with that (you should see the kind of stuff I work on). I was just thinking aloud: why are we doing this in the first place? I would never ask that question about medicine.

        • One other point: We criticize Psych. experiments for building edifices on shaky foundations of crappy measurements.

          But what’s the quality yardstick of typical opinion poll and other such inputs that go into poll prediction models? Are such measurements high quality and more robust? How noisy are pollster inputs? When polls are so damn close are we capable of extracting any signal from the baseline noisiness?

          What’s the analysis on this account?

          I'm skeptical because the few times I answer a survey (because Dell calls me, or there's a click-bait title in my inbox, or some tantalizing incentive), I'm later aghast at the horrible or endless questions that follow and mostly end up clicking random responses.

          If I’m anywhere close to the median respondent then you are just modelling garbage.

        • I also had a polling experience once. I was relatively new to Germany and my German was still shaky. I knew nothing about German politics. A pollster calls me and starts asking me questions about my political leanings. It was the first time anyone had polled me, so I started to answer more or less at random. Between the leftists and the CDU, which one do you prefer? CDU. Do you think Lafontaine should be chancellor? Sure. He was totally confused and kept checking with me whether I really meant to say what I had just said. He did complete the poll, though. It was totally crazy. He could have figured out quickly that I had no idea what I was talking about.

  3. I’ve been updating my facebook feed using Kremp’s script. Super handy, even if it exaggerates confidence :/.
    So far, this has been heartbreaking, to say the least. What a weird election.

    Now I wish I had data on alcohol sales leading up to the election.

  4. It’s nice to see frank admissions of overconfidence, even after the fact. (The one on the Slate blog after 15 minutes was great!) Still, I have high confidence that you remain overconfident, although much less overconfident than most. Including me. But not including Nate Silver, dammit.

    Personally, I was sure that the Cubs winning the world series had nothing to do with the Apocalypse one way or the other. Yet here we are.

  5. I find it strange that you blame the massive failure of polls of all stripes on correlation. The repeated incapacity of polls to predict political results seems to me more indicative of a chasm between the data collected and the final outcome, either because the data does not relate to the true population or because the answers collected at poll time are not the choices expressed at voting time. This sounds like a call to arms to reconsider polling altogether.

        • It's like any model: if you have information, you attempt to predict the correlated residuals.

          If you don’t have information that helps to predict the correlated errors, modeling errors as correlated leads to overdispersed prediction intervals.

          If they get wide enough, then, as Xi'an says, why bother using the data?

    • The “bayesian adjustment of random digit dialing telephone polls” is still based on the idea that random digit dialing telephone polls have some kind of informative validity. If you call 10,000 numbers and get 1000 answers, what’s the worst that the bias could be?

      My impression is that modelers don't include a "the whole process could have a bias that's normally distributed at ±5% or so" term, not because they don't believe that's true, but because if you do that you wind up with something not much better than asking your aunt Greta who she thinks will win.

    • Xi'an: the thing is, if the polling errors weren't correlated, then even if each poll had a very large error they would still average out pretty well. But the polling errors *were* correlated, which actually supports your general idea that we should put less faith in them, because it seems they can all be wrong in the same direction at the same time.
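
      A toy simulation of that point (the numbers are invented; this is not any pollster's actual model): independent sampling error washes out when you average many polls, but a bias shared by every poll does not, no matter how many polls you take.

      set.seed(1)
      n_polls     <- 20
      sampling_sd <- 1.5   # sd (in points) of a single poll's margin, from sampling alone
      shared_bias <- 2     # a systematic error hitting every poll the same way
      true_margin <- 0

      indep  <- true_margin + rnorm(n_polls, 0, sampling_sd)
      biased <- true_margin + shared_bias + rnorm(n_polls, 0, sampling_sd)

      mean(indep)                  # close to the truth: independent errors cancel
      mean(biased)                 # still off by about 2 points after averaging
      sampling_sd / sqrt(n_polls)  # the misleadingly small standard error you'd report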

  6. None of the polls predicted 0% chance of victory for Trump, i.e. the polls were saying there was a non-zero chance he would win. He won. I don’t see the contradiction.
    Sometimes black swans happen.
    This was one of those times.
    No need to re-examine the basics unless this type of thing keeps happening, which would indicate something beyond a fluke: a systemic problem with the methodologies behind these polls.
    But now is too soon to start doing that.

    • Even if the only data by which the models/polls could be evaluated were a single coin-flip outcome, it would be considerable evidence against them if it comes up heads when you gave tails a probability greater than 99%, as some did.

      But there's more data: there are all the state polls and the national polls, and when Trump wins Wisconsin and Michigan, where not a single one of the many polls EVER showed him ahead, that's unlikely to be mere chance.
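
      To put a rough number on that first point (figures purely illustrative):

      p_confident <- 0.01       # chance a very confident model gave to a Trump win
      p_coinflip  <- 0.50       # chance a know-nothing coin-flip model gave to it
      p_coinflip / p_confident  # = 50: the national result alone favors the coin flip 50 to 1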

  7. A few things strike me about the polls, models, and reactions. First, despite all our education and knowledge, the human tendency to want certainty is stronger than we realize. Even if a poll/model predicted a 99% chance of Clinton winning, it was not 100%. So why should we react as if something was wrong when the 1% event happens? Isn't that the whole argument against NHST, just played out in the election context?

    Second, to the extent there is a systemic error in how we are polling and modeling, it seems to lie in how we treat nonresponse. It is not just a matter of nonresponse bias that might be picked up in party affiliation. More and more, people are not really affiliated with any party, and, apparently, more and more, nonresponse itself conveys some meaningful information. In this case, it was a sign that people wanted to be unpredictable and wanted a change, regardless of whether that particular change was something they liked. I think this can be modeled, but I suspect the noisiness in the data has just increased.

    Finally, for all the claims made for algorithms replacing human judgement, this election should give some pause. This is an opportunity to improve the models – until the next surprise catches us again (Fooled by Randomness). It may still be the case that human judgement performs worse than even flawed algorithms, but that is setting the bar too low – sort of like this election, where one deeply flawed candidate was seen as better than the other deeply flawed candidate.

    • > This is an opportunity to improve the models – until the next surprise catches us again

      That's always present, even with human judgement.

      Models (representations) are just (hopefully) purposefully reflected-upon human judgements (judgement on steroids?) wherein surprises should be more easily sought out and discerned.

      Now, when time is limited and there is little to no time to check for model fit (surprises), might less reflected-upon models be better?
      Some argument for that here: Heuristics: Tools for an Uncertain World, http://onlinelibrary.wiley.com/doi/10.1002/9781118900772.etrds0394/abstract

      • Gigerenzer has always portrayed his work as a disagreement with Kahneman, et al, but I don’t quite see it that way. Heuristics are both necessary (for survival reasons even) and flawed (due to all sorts of cognitive biases). Models are increasingly necessary to deal with the complexity of a technologically mediated world – but have their own flaws. Clearly we need both, but I don’t think humans have evolved a good ability to use both for decision making. We use them both – but it is not clear that we are making good decisions.

  8. Another issue about the polls' legitimacy that we should never forget is the so-called house effect, i.e., the natural tendency of each pollster to adjust their final estimates in order to correct for biases due to non-sampling errors (Kremp and Linzer actually took this into account in their models). I am not an American citizen and I am not an expert on US polls, but I would guess that most of the missingness in the answers has been adjusted towards the "Clinton side." Then, are most of the pollsters a priori Democratic?

  9. I wonder about two things here. First, is it possible that more voters changed their mind at the last minute than in past elections? The erratic, go-with-the-gut character of this campaign may have affected the voters themselves.

    Second, is there maybe something anomalous about Trump that breaks through the predictive models, something we really haven’t dealt with before? (Um, yes… but I’m not sure how the predictions could have accounted for it.)

    I appreciate the opportunity to see the forecast updating model and play with it. It got me to install R. I think I have to set R up properly, with all the required packages, before I can get the code to run; I am traveling now but look forward to resuming the effort. It has been a while since I worked with code in any substantial way, but it’s coming back.
