Hey pollsters! Poststratify on party ID, or we’re all gonna have to do it for you.

Alan Abramowitz writes:

In five days, Clinton’s lead increased from 5 points to 12 points. And the Democratic party ID margin increased from 3 points to 10 points.

No, I don’t think millions of voters switched to the Democratic party. I think Democrats were just more likely to respond in that second poll. And, remember, survey response rates are around 10%, whereas presidential election turnout is around 60%, so it makes sense that we’d see big swings in differential nonresponse to polls, swings that we would not expect to map to comparable swings in differential voting turnout.

We’ve been writing about this a lot recently. Remember this post, and this earlier graph from Abramowitz:

[Graph from Abramowitz: each poll’s Clinton support (y-axis) plotted against its Democratic party ID margin (x-axis).]

and this news article with David Rothschild, and this research article with Rothschild, Doug Rivers, and Sharad Goel, and this research article from 2001 with Cavan Reilly and Jonathan Katz? The cool kids know about this stuff.

I’m telling you this for free cos, hey, it’s part of my job as a university professor. (The job is divided into teaching, research, and service; this is service.) But I know that there are polling and news organizations that make money off this sort of thing. So, my advice to you: start poststratifying on party ID. It’ll give you a leg up on the competition.

That is, assuming your goal is to assess opinion and not just to manufacture news. If what you’re looking for is headlines, then by all means go with the raw poll numbers. They jump around like nobody’s business.

P.S. Two questions came up in discussion:

1. If this is such a good idea, why aren’t pollsters doing it already? Many answers here, including (a) some pollsters are doing it already, (b) other pollsters get benefit from headlines, and you get more headlines with noisy data, (c) survey sampling is a conservative field and many practitioners resist new ideas (just search this blog for “buggy whip” for more on that topic), and, most interestingly, (d) response rates keep going down, so differential nonresponse might be a bigger problem now than it used to be.

2. Suppose I want to poststratify on party ID: what numbers should I use? If you’re poststratifying on party ID, you don’t simply want to adjust to party registration data: party ID is a survey response, and party registration is something different. The simplest approach would be to take some smoothed estimate of the party ID distribution from many surveys: this won’t be perfect, but it should be better than taking any particular poll, and much better than not poststratifying at all. To get more sophisticated, you could model the party ID distribution as a slowly varying time series, as in our 2001 paper, but I doubt that’s really necessary here.
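
To make that suggestion concrete, here is a minimal sketch of the reweighting step. Everything below is invented for illustration; in practice the target distribution would come from the smoothed multi-survey estimate just described.

```python
# A minimal sketch of poststratification on party ID (all numbers invented).
# Estimate Clinton support within each party-ID cell, then reweight the cells
# to a target party-ID distribution instead of the poll's own (noisy) mix.

# Hypothetical raw poll: respondent counts and Clinton share by party ID.
poll = {
    # cell: (n_respondents, clinton_share_within_cell)
    "Democrat":    (450, 0.92),
    "Republican":  (300, 0.06),
    "Independent": (250, 0.45),
}

# Hypothetical target party-ID distribution, e.g. a smoothed average over
# many recent surveys (NOT party registration, which is a different thing).
target = {"Democrat": 0.36, "Republican": 0.30, "Independent": 0.34}

n_total = sum(n for n, _ in poll.values())

# Raw estimate: each cell weighted by its share of the sample, so a wave of
# extra Democratic respondents drags the topline around.
raw = sum((n / n_total) * share for n, share in poll.values())

# Poststratified estimate: each cell weighted by the target distribution,
# so differential nonresponse by party no longer moves the topline.
post = sum(target[cell] * share for cell, (_, share) in poll.items())

print(f"raw estimate:            {raw:.3f}")
print(f"poststratified estimate: {post:.3f}")
```

The adjustment itself is a few lines of arithmetic; the real work is in choosing a sensible target distribution, which is what the smoothing is for.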

39 thoughts on “Hey pollsters! Poststratify on party ID, or we’re all gonna have to do it for you.”

  1. Obviously the swings are less severe than the polls capture – but isn’t some of the variation in actual voting due to differential voting turnout? If so, this probably largely follows enthusiasm – which is captured by differential nonresponse to polls.

    Wouldn’t poststratification remove this effect entirely?

    • David:

      You can have a model in which the distribution of party ID changes slowly over time, as in our 2001 paper, and you can model the probability of turnout varying too. But these short-term swings, I think they’re essentially all coming from changes in nonresponse that say very little about likelihood to vote and much more about who feels like participating in a poll.

      • I agree that you can model those factors, and that clearly helps to capture these changes – but a multilevel model helps most if the underlying generating process is stable. (Or is changing in a way your model captures.) But…

        It is plausible to me that something fairly de novo, like Trump’s candidacy (plus a resulting more-viable third-party vote, especially in a state like Utah), is captured better by a more model-free / non-parametric approach than by the sort of model-driven approach you usually employ. Of course, I can justify this in a Bayesian framework as using very different priors, but in practice it’s probably more-or-less equivalent to what Sam Wang is doing.

        • David:

          No, I don’t think millions of voters switched to the Democratic party during those five days. And “model-free” doesn’t really mean anything. The implicit assumption of pollsters right now is that there is zero variation in differential nonresponse by partisanship. And why should that be? Why, in this tumultuous campaign season, would we expect zero variation? That makes no sense at all. I think what happens is that certain assumptions are “grandfathered in.” They’re assumptions people make implicitly, so they don’t seem like assumptions. But, believe me, an assumption of zero variation in differential nonresponse is a very strong model.

        • Right, that’s like watching the water slosh around in your glass and assuming there must be wild changes in the gravitational constant of the earth, when in fact your glass is on a bar in a cruise ship in the middle of a storm.

        • I certainly agree with your point – and your seminal paper on differential response to pollsters is correct. On the other hand, I think the implicit assumption of pollsters is that their consumers expect the polls to work a certain way, and poll aggregators like Sam Wang and Nate Silver are there, to some extent, to correct for that by re-weighting polls and forecasting the election. The accuracy of polls taken right before the election, however, implies that differential response and differential likelihood to vote are related.

          Your point about “model free” is fair – but, if I understood correctly, simple poststratification makes similarly strong assumptions. I’d wonder about estimating the relationship between polling response rates and voting behavior – because I (weakly) suspect that the volatility in response rates reflects (though also probably magnifies) the volatility of turnout.

          Following advice I’ve seen you and others give about constructing generative models, I’d wonder if building such a model would help here – or if, as I suspect, there is insufficient data (and useful variation) in the past 50 years of elections and polls to materially inform a model with so many degrees of freedom.

        • David:

          A model adjusting for party ID can definitely do better: see our Xbox paper, in which we do better than the polling aggregators. There’s definitely “sufficient data” to outperform the crude model that makes the ridiculous assumption that there is zero differential nonresponse. It’s my impression that the professional political consultants—the best ones, at least—do these adjustments, cos it’s worth it to their clients to really know what’s going on.

        • An implicit point of these discussions is whether we ought to judge a model by its assumptions or its performance.

          e.g. Can a model with one ridiculous assumption give satisfactory performance? I think yes, oftentimes.

        • Andrew: I completely agree that zero differential response is wrong. The question I tried to ask (but failed to ask clearly) is whether the data are sufficient to estimate the relationship between polling nonresponse and turnout. That seems harder, but it is potentially a way to use the information that representative polls can provide and non-representative polls cannot. Because as long as the representative polls aren’t taking advantage of that type of data, I’d agree that non-representative polls are just as effective, and much cheaper.

          Rahul: If it gets lucky in a single case, it’s just a bad model. If a model with a “ridiculous” assumption is performing correctly across the range of inputs, it’s implicitly a good simplification of the correct model. If it only works conditioned on factors that may change, it’s useful for prediction (a la machine learning) but not nearly as useful for manipulation or decision making.

  2. I think it’s an excellent point, but one which might get lost in all the media bashing of LongRoom.com. For example, Harry Enten, the “data journalist” over at FiveThirtyEight, wrote a post largely dismissing non-response bias the other week: http://fivethirtyeight.com/features/the-polls-arent-skewed-trump-really-is-losing-badly/

    Other than the fact that we snooty statisticians have reserved the word “skew” for a phenomenon that ain’t the same thing as bias, I think he’s overly dismissive of changes in party identification. Isn’t he supposed to be one of the guys who believes in models?

    He’s a New Yorker too, maybe you can stop by and have it out with him. Go spiritually Bayesian and make a wager on it: the loser gets to swim 100 meters in Newtown Creek.

      • LongRoom.com was the 2016 version of “unskewing” the polls (after the unskewedpolls.com debacle in 2012, which claimed that Mitt Romney was going to win in a landslide). That’s also why 538 used the word “skewed.”

        I think, to reach a synthesis, it’s fair to say that the problem with LongRoom was its assumption of a structural, rather than temporally varying, non-response bias. LongRoom posited that Democrats and Republicans should have equal party weighting, and thus that all the polls were biased, and thus that Trump is currently winning the presidential race. While assuming zero variance in non-response bias is a faulty assumption, it is likely not as faulty (as Harry demonstrates) as assuming zero variance in party identification about a mean estimate that is out of step with historical data.

    • This:

      One can estimate this using some sort of smoothed average. The point is that the distribution of party ID is changing much more slowly than the distribution of vote preference, which itself is changing much more slowly than differential nonresponse.

      To put it another way, to not poststratify on party ID is to implicitly assume that the best estimate of party ID in the electorate is just . . . whatever the responses happen to be on your latest survey. It shouldn’t be hard to do better than that!
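
      For concreteness, here is one version of that smoothed average: an exponentially weighted moving average of the party ID margins from a series of polls. It’s only a sketch, and the margins below are invented.

```python
# One way to form a smoothed party-ID target: an exponentially weighted
# moving average over the D-minus-R party-ID margins reported by a series
# of polls. The margins are invented; alpha controls how fast old polls fade.

def ewma(values, alpha=0.15):
    """Exponentially weighted moving average; smaller alpha = heavier smoothing."""
    smoothed = values[0]
    for v in values[1:]:
        smoothed = alpha * v + (1 - alpha) * smoothed
    return smoothed

# Hypothetical D-minus-R party-ID margins (percentage points), oldest first.
poll_margins = [3.0, 4.0, 10.0, 5.0, 3.5, 9.0, 4.0]

print(f"latest poll's margin: {poll_margins[-1]:.1f}")    # jumps poll to poll
print(f"smoothed margin:      {ewma(poll_margins):.1f}")  # moves slowly
```

      The single-poll margin bounces around with differential nonresponse; the smoothed margin is the slowly changing quantity that makes a sensible poststratification target.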

    • To add to Andrew’s point, we can also estimate party identification from surveys like the American National Election Study, General Social Survey, and others. These surveys have higher response rates than most media polls because interviews are conducted face-to-face, repeated attempts are made at each household, incentives are offered to motivate participation for especially hard cases, etc. Pollsters are usually under intense time and business pressures, making it difficult for them to do these things.

      It’s also worth noting that party affiliation is an identification, which makes it fairly stable over time (identities do change, of course, but not for large swaths of the public on any given day).

      • If we accept that party identification is stable but candidate choice fluctuates a lot, is that tantamount to saying that there’s a lot of reported cross-voting? Is there a lot of cross-voting reported in polls?

        • Well, there’s certainly greater instability in candidate-choice measures than in party identification – especially if taken early in the election cycle (i.e., the polls are noisier early in the campaign; see Will Jennings and Christopher Wlezien’s 2016 AJPS paper) – but even these are relatively stable for partisans as Election Day draws near.

          I think, as Andrew noted, you’re seeing fluctuations in the polls due to differential response rates. So it’s not that Republicans are backing the Democrat and vice versa (although there are some candidates who can draw out-partisans, e.g., Reagan) but that any given poll may have fewer Republicans (or Democrats) in it. If pollsters don’t adjust their survey weights accordingly, they’re likely to see shifts in polling numbers that simply aren’t there. And then there are difficulties related to how we categorize leaners or weak partisans. Are they Independents?

          I used to run a survey center and can tell you that it’s difficult work to get good survey response rates, particularly among some demographics. And this would be for academic surveys that typically took several weeks (allowing multiple attempts staggered over times of day and days of the week, including weekends). Most polls are conducted over a couple of nights at most, so some people never really get a fair chance to be included in the sample. Applying good survey weights can help mitigate some of these issues, but that’s assuming that the individuals you’re now upweighting are representative of their respective groups (which they may very well not be). But I digress.

          So, I would say no, that you’re not seeing a lot of cross-voting in the polls. Even with a polarizing candidate like Trump, it’s hard to believe that many Republicans would really shift their votes for Clinton (although they may say they’d do so early on).

  3. Ok – so let’s assume you’re right (and this seems reasonable to me). Why aren’t pollsters doing it? I imagine part of it is inertia in a polling world that’s been slow to give up on some ideal world of random sampling and near-universal response, but pollsters are already poststratifying quite a bit on basic demographics. What’s different about party?

    • Joel:

      Good question. Part of it, I think, is that survey sampling is a conservative field. Another part is the incentives: in some ways it’s good to have your poll numbers jump around, because this generates headlines. Yet another issue is that response rates keep going down, so I’m guessing that differential nonresponse is a bigger deal than it used to be.

      • Is the differential nonresponse strongly correlated with the response rate itself? I.e., as the response rate drops, do we know that more Democrats (or Republicans) respond?

  4. Dumb question: Doesn’t that plot show that the more Democrats there are in a sample (partyID; x-axis) the more they are likely to vote for Clinton (y-axis)?

    If so, why is that interesting? It’d only be interesting if we knew something like “People only change their candidate preference but never their PartyID,” or in other words, “People whose opinion has changed and who’ve decided to vote for Clinton will still answer a PartyID question as Republican.”

    In other words if the x axis was something immutable like color, sex etc. that’d make sense. But if both partyID & vote-preference changes how do we conclude anything from this graph?

    • Rahul:

      In one way, the graph is not interesting at all! But it’s relevant to how polls are reported. Given that the distribution of party ID in the population is not changing fast, this suggests that apparent big swings in the polls are just the product of changes in patterns of nonresponse.

      That is, what seems like big news, isn’t.
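
      A toy simulation makes the point (all numbers invented): hold everyone’s vote preference and the population’s party mix completely fixed, and vary only how willing each party’s supporters are to respond to the survey.

```python
# Toy simulation of differential nonresponse: opinion is held completely
# fixed, only the parties' response rates change, and the raw topline still
# "swings" by about six points. All numbers are invented.

population = {"Democrat": 0.36, "Republican": 0.30, "Independent": 0.34}
clinton_share = {"Democrat": 0.92, "Republican": 0.06, "Independent": 0.45}

def raw_topline(response_rate):
    """Clinton share among respondents, given per-party response rates."""
    responders = {p: population[p] * response_rate[p] for p in population}
    total = sum(responders.values())
    return sum(responders[p] / total * clinton_share[p] for p in population)

# Week 1: the parties respond at similar (low) rates.
week1 = raw_topline({"Democrat": 0.09, "Republican": 0.09, "Independent": 0.08})
# Week 2: Democrats are in the news and feel like answering surveys.
week2 = raw_topline({"Democrat": 0.12, "Republican": 0.08, "Independent": 0.08})

print(f"week 1 raw Clinton share: {week1:.3f}")
print(f"week 2 raw Clinton share: {week2:.3f}")  # a "swing" with zero opinion change
```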

      • Interesting. How can we tell that partyID in the population does *not* change as fast as the other variable, candidate preference?

        Do we have a number for how much more likely people are to switch to the other party’s candidate than to switch their own party identification?

        PS. In typical polls, how big is the share of “cross-voters,” i.e., people who identify with Party A but say they voted for the Party B candidate?

  5. The last four Presidential elections have been driven in large measure by the Marriage Gap, with Republicans doing much better among married white people than among unmarried white people. Unfortunately, few pundits talk about the Marriage Gap, whereas everybody talks about the Gender Gap. Exit polls even sometimes neglect to ask about marital status.

    On the other hand, Trump could be such a wild card that he could mix up what has been a pretty stable century for Presidential voting. This could be one of those years in which the Gender Gap is much more important than the Marriage Gap.

    Or maybe not.

  6. Andrew,

    How do you disentangle the effect of differential non-response from the simple affirmation of current preference for which party identification questions are often proxies among those that are “on-the-fence”? In other words, party identification need not reflect an actual change in registered Democrats vs. Republicans, or even differential non-response, as you’re implying, although I agree the latter is likely an effect. But there is another possibility—it could be that there are certain people for whom that question is a mere reflection of their current opinion of the presidential candidates, and so post-stratifying based on party ID would overcorrect, and take out some of the actual change in opinion.

    How would you go about accounting for that?

  7. Andrew – why isn’t it sufficient to post-stratify on age/education/race/gender? As far as I can tell, your XBOX paper doesn’t emphasize the importance of party ID above these other factors. Couldn’t we get the party-ID stratification “for free” by stratifying on these other variables which voters are less likely to fudge on?

  8. Professor Gelman,

    Please post YOUR election predictions – poststratified or not. I shall look forward to comparing your accuracy to that of Professor Sam Wang of the Princeton Election Consortium, the New York Times’ Upshot, Nate Silver of FiveThirtyEight, and Electoral-Vote.com.

    For the record, my bet is on Professor Wang, who admirably refrains from putting any finger on the scale, be it to “poststratify”, compensate for “poll bias” or add special sauces (à la Mr. Silver), or to otherwise unskew.

    • Olav:

      Why is poststratifying “putting a finger on the scale”? Survey respondents are not a simple random sample from the population of voters. All pollsters make adjustments. If you really want to keep your finger off the scale, try just looking at raw numbers. You can admirably refrain from doing anything to the data at all. Take that approach and you’ll see huge swings in the polls every week (see the Abramowitz quote above) which I’m pretty sure have a lot more to do with swings in nonresponse than in anyone changing their opinion.

      • “All pollsters make adjustments.”

        The highly respected poll aggregator Sam Wang does not. That’s my point.

        Please post your own election prediction — and let’s compare.

        • Olav:

          Poll aggregators do use adjustments. They just take whatever adjustments the polling organizations happen to use. But what my colleagues and I are saying is that those adjustments are not enough. It’s from the unadjusted polls that people got the mistaken impression of large swings in public opinion during the conventions, when what was really happening was large swings in differential nonresponse.

  9. “It’s from the unadjusted polls that people got the mistaken impression of large swings in public opinion during the conventions, when what was really happening was large swings in differential nonresponse.”

    That mistaken impression only happens when people overemphasize single polls. Which of course is what the media does all the time – and it generates lots of headlines based on selective reporting. You will find no such volatility with a reliable poll aggregator such as the Princeton Election Consortium. I think you and I agree that it is a serious mistake to focus unduly on any one poll!

    The history of PEC’s Electoral Vote Estimate and Meta-Margin has been remarkably stable. Why is that? It’s simple, really: those polls even out. Moreover, Wang considers only state-wide polls, not national polls. And since his calculations are based on the median, not the mean, outlier polls do not generate undue volatility (see the quick sketch below).
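
    A quick illustration of that median-versus-mean point, with made-up poll margins:

```python
# Why median-based aggregation resists outlier polls (margins invented).
from statistics import mean, median

margins = [4.0, 5.0, 4.5, 5.5, 4.0]   # Clinton-minus-Trump, percentage points
with_outlier = margins + [14.0]       # one poll with a big nonresponse swing

print(f"mean:   {mean(margins):.1f} -> {mean(with_outlier):.1f}")      # 4.6 -> 6.2
print(f"median: {median(margins):.1f} -> {median(with_outlier):.1f}")  # 4.5 -> 4.8
```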

    Please examine the history of Dr Wang’s Meta-Margin and EV Estimator, and you will see how remarkably stable it is. (The Meta-Margin is defined as the amount of opinion swing that is needed to bring the Median Electoral Vote Estimator to a tie.)

    http://election.princeton.edu/history-of-meta-analysis/

    NB. Since May, Clinton has had a Meta-Margin advantage of at least 2.5%; it’s currently 6.3%. Her Electoral College estimate is currently 341, and since May it has never dropped below 305 electoral votes. That is stable!

    Those who believe Trump has ever been ahead, or even close to tied, have a mistaken impression because they are looking at single polls or the wrong data, failing to aggregate the state polls that are available.

  10. Hi, I don’t know if you’ve touched on this beforehand, but does the fact that grouping partisan leaners in their respective parties has not become a uniform practice across different pollsters make this practice of weighting by partisanship more difficult?

    On a related note, you suggested “the simplest approach would be to take some smoothed estimate of the party ID distribution from many surveys” for this weighting. Here’s what I found as the most accessible resource for that – http://elections.huffingtonpost.com/pollster/party-identification – but it obviously doesn’t group leaners. That’s not what we want for weighting by party ID, is it?
