Nate Silver and Justin Wolfers are having a friendly blog-dispute about momentum in political polling. Nate and Justin each make good points but are also missing parts of the picture. These questions relate to my own research so I thought I’d discuss them here.
There ain’t no mo’
Nate led off the discussion by writing that pundits are always talking about “momentum” in the polls:
Turn on the news or read through much of the analysis put out by some of our friends, and you’re likely to hear a lot of talk about “momentum”: the term is used about 60 times per day by major media outlets in conjunction with articles about polling.
When people say a particular candidate has momentum, what they are implying is that present trends are likely to perpetuate themselves into the future. Say, for instance, that a candidate trailed by 10 points in a poll three weeks ago — and now a new poll comes out showing the candidate down by just 5 points. It will frequently be said that this candidate “has the momentum”, “is gaining ground,” “is closing his deficit,” or something similar.
Each of these phrases are in the present tense. They create the impression that — if the candidate has gone from being 10 points down to 5 points down, then by next week, he’ll have closed his deficit further: perhaps he’ll even be ahead!
But, as Nate points out, this ain’t actually happening:
Say that a candidate has improved her position in the polls from August to September. Is her position more likely than not to improve further from September to October? . . . this is not what we see at all . . . Sometimes, a candidate who has gained ground in the polls continues to do so; otherwise, the trend reverses itself, or the race simply flatlines. . . . There is also no sign of momentum [when] we look at the change in polling between other periods. . . . In general elections, the direction in which polls have moved is not predictive of the direction in which they will move. [italics added]
I like Nate’s analysis. It’s very much in the Bill James style, but with graphs.
Consider the time scale
Enter Justin Wolfers, who writes that Nate is all wrong, that there is momentum in political polling.
Justin argues that Nate made a mistake by using the same data in his “before” and “after” comparison. Suppose you have poll averages at times A, B, and C. Nate is saying that the change from A to B does not predict the change from B to C (or, to be precise, that the change from A to B is a very weak and negative predictor of the change from B to C). Justin says that you should gather data at times A, B, C, and D, and see if the change from A to B predicts the change from C to D. Using a time series of unemployment data (since he didn’t have ready access to poll summaries), Justin indeed finds a positive correlation between B-A and D-C.
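To see why sharing the measurement at time B matters, here is a minimal simulation sketch. It is not the analysis that either Nate or Justin ran. It assumes each race has a constant true trend per period and that each poll average adds independent noise; the function names, the default standard deviations, and the number of simulated races are all made up for illustration.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_changes(n_races=20000, trend_sd=1.0, noise_sd=2.0, seed=0):
    """Return (corr(B-A, C-B), corr(B-A, D-C)) across simulated races.

    Assumption: true opinion in each race moves by a fixed `slope` per
    period, and each observed poll average is truth plus independent noise.
    """
    rng = random.Random(seed)
    b_minus_a, c_minus_b, d_minus_c = [], [], []
    for _ in range(n_races):
        slope = rng.gauss(0, trend_sd)  # persistent trend in true opinion
        # Observed poll averages at times A, B, C, D (t = 0, 1, 2, 3).
        a, b, c, d = (slope * t + rng.gauss(0, noise_sd) for t in range(4))
        b_minus_a.append(b - a)
        c_minus_b.append(c - b)
        d_minus_c.append(d - c)
    return pearson(b_minus_a, c_minus_b), pearson(b_minus_a, d_minus_c)


if __name__ == "__main__":
    adjacent, disjoint = simulate_changes()
    print("corr(B-A, C-B):", round(adjacent, 3))
    print("corr(B-A, D-C):", round(disjoint, 3))
```

With these assumed settings the theoretical correlations are -1/3 for the adjacent comparison and +1/9 for the disjoint one. The noise in B enters B-A with a plus sign and C-B with a minus sign, so it pushes the adjacent correlation negative even though the underlying trend is real.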
How to adjudicate the Silver/Wolfers dispute? My quick answer is that they’re both correct. Yes, Justin is correct that if you want to measure the persistence of trends, the D-C vs. B-A comparison can be better than the C-B vs. B-A comparison.
But Nate is correct too in that he’s answering a direct question. People really do look at the change from A to B and then try to extrapolate it to C. And Nate’s analysis directly addresses the problems in trying to do this extrapolation.
From a statistical standpoint, any study of trends has to consider the time scale. At the shortest time scales, there is certainly “momentum” in the underlying time series but the actual measurements are very noisy, making it difficult to see any trend. At longer scales, you can average to get lower noise levels, but then trends are not so stable. You can see some of this from the unemployment series that Justin posted, which is a mix of long-term variation on the 5-year scale and short-term noise of the monthly measurements.
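The noise half of this tradeoff can be put in numbers with a small sketch. It assumes one poll per day with independent noise and a constant underlying slope, and it compares the averages of two adjacent windows of equal length. The function name and the example values are hypothetical, and the sketch deliberately ignores the other half of the tradeoff, namely that real trends are less stable over longer windows.

```python
import math


def window_change_snr(slope_per_day, noise_sd, window_days):
    """Signal-to-noise ratio for the change between two adjacent window averages.

    Assumptions (for illustration only): one poll per day, independent
    noise with standard deviation `noise_sd`, constant true slope.
    """
    # Centers of adjacent windows are `window_days` apart.
    signal = slope_per_day * window_days
    # Each window mean has sd noise_sd / sqrt(k); their difference has sqrt(2) times that.
    noise = noise_sd * math.sqrt(2.0 / window_days)
    return signal / noise


if __name__ == "__main__":
    for k in (1, 7, 28):
        print(k, "day windows:", round(window_change_snr(0.1, 3.0, k), 3))
```

Under these assumptions the ratio grows with the window length raised to the power 1.5, which is why a day-to-day change tells you almost nothing while a month-to-month change can be informative, provided the trend really did hold steady that long.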
Public opinion is not a random walk
Nate does slip up at one point, when he writes:
In races with lots of polling, instead, the most robust assumption is usually that polling is essentially a random walk, i.e., that the polls are about equally likely to move toward one or another candidate, regardless of which way they have moved in the past.
This isn’t quite right. For many races, you can use a forecast from the fundamentals to get a pretty good idea about where the polls are going to end up. Our original example here is the 1988 presidential election campaign. Even when Michael Dukakis was up 10 points in the polls, informed experts were pretty sure that George Bush was going to win. Or, if you want to focus on congressional elections, take a look at the work of Erikson, Bafumi, and Wlezien, who find predictable changes in the generic opinion polls in the year leading up to the election.
Nate’s error, I think, is coming from two sources. First, individual polls are noisy, and so any immediate changes are likely to be noise. Second, the random walk is such a standard paradigm for statistical noise that it’s natural for Nate to use it as a default. “A random walk down Wall Street” and all that. But polls are not stock markets. As I’ve said many times, a poll is a snapshot, not a forecast. There really can be predictable changes in the polls, even if they’re hard to notice amid short-term variation.
If you switch the application area from politics to baseball, I think Nate would see this right away. Are the baseball standings a random walk? No. If a team has an unexpectedly good record midway through the season, it’s likely that they will slip in the standings during the second half. (I haven’t looked at the actual numbers for baseball teams, but general statistical principles would suggest that the best prediction for a team’s winning percentage during the second half of the season, given the team’s record in the first half, would be something in between the preseason prediction and the team’s midseason record.)
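The “something in between” prediction can be sketched as a weighted average. This is a generic shrinkage formula, not something fit to baseball data: the weight on the midseason record grows with the number of games played, and `prior_games` is a hypothetical tuning parameter saying how many games of evidence the preseason prediction is worth.

```python
def second_half_prediction(preseason_pct, midseason_pct, games_played, prior_games):
    """Shrink the midseason winning percentage toward the preseason prediction.

    `prior_games` is an assumed value, not an estimate: it expresses the
    preseason prediction's strength in equivalent games.
    """
    weight = games_played / (games_played + prior_games)
    return weight * midseason_pct + (1 - weight) * preseason_pct


if __name__ == "__main__":
    # Hypothetical team: predicted .500, playing .600 ball after 81 games.
    print(second_half_prediction(0.500, 0.600, games_played=81, prior_games=70))
```

Whatever positive value you pick for `prior_games`, the prediction lands strictly between the preseason number and the midseason record, so a team that is overperforming is expected to slip back somewhat.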
Why this bugs me
From a political science standpoint, I want to continue pushing back hard on this because I believe the random walk story has contributed to major misunderstandings about politics. (I was going to write “the random walk story contributes to . . .” but then decided to follow Nate’s recommendation to describe the past as past rather than to implicitly extend trends into the future.)
The natural accompaniment to the random walk model is the idea that, if you can shift the polls by X percentage points at any point during the campaign, this will give you an expected X percentage point advantage when the election comes around. Another implication is that George Bush’s campaign must have been awesome, and Michael Dukakis’s horrible, to explain the big swing in the polls in the months leading up to the 1988 election. (We think a much more plausible story is that the polls were gonna swing toward Bush big-time, and the perceived incompetence of Dukakis’s campaign was a consequence, not a cause, of this polling shift.)
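Here is a small sketch of how much the choice of model matters for that X-point claim. The random walk is as described above. The alternative is a simple AR(1) reversion toward the fundamentals-based forecast, which I am using as a stand-in and not as a fitted model; the `persistence` value and the function names are hypothetical.

```python
def expected_shift_random_walk(shift_points, days_to_election):
    """Under a random walk, a shift today is expected to persist in full."""
    return shift_points


def expected_shift_mean_reverting(shift_points, days_to_election, persistence):
    """Under assumed AR(1) reversion to the fundamentals, a shift decays daily.

    `persistence` is the fraction of a deviation that survives each day
    (a made-up illustrative parameter between 0 and 1).
    """
    return shift_points * persistence ** days_to_election


if __name__ == "__main__":
    print(expected_shift_random_walk(5.0, 100))
    print(expected_shift_mean_reverting(5.0, 100, persistence=0.98))
```

If public opinion is pulled toward the fundamentals, a 5-point shift achieved 100 days out is expected to be worth much less than 5 points on election day, while the same shift achieved in the final week is worth most of its face value.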
In summary, “momentum” can exist, but the places where you’ll see it are races where current public opinion is out of step with best predictions. The mere information that a race has a 5-point swing is not enough to predict a future shift in that direction. As Nate emphasizes, such a prediction is only appropriate in the context of real-world information, hypotheses of “factors above and beyond the direction in which the polls have moved in the past.”