It’s tough being a blogger who’s expected to respond immediately to topics in his area of expertise.
For example, here’s Scott “fraac” Adams on 8 Oct 2016, in a post titled “Why Does This Happen on My Vacation? (The Trump Tapes).” After some careful reflection, Adams wrote, “My prediction of a 98% chance of Trump winning stays the same.” And that was all before the second presidential debate, which “Trump won bigly. This one wasn’t close.” I don’t know what Trump’s chance of winning is now. Maybe 99%. Or 108%.
That’s fine. When Gregg Easterbrook made silly political prognostications, I was annoyed, because he purported to be a legitimate political writer. Adams has never claimed to be anything but an entertainer and, by golly, he continues to perform well in that regard. So, yes, I laugh at Adams, but I don’t see why he’d mind that. He is a humorist.
What interested me about Adams’s post of 8 Oct was not so much his particular opinions—Adams’s judgments on electoral politics are about as well-founded as my takes on cartooning—but rather his apparent attitude that he had a duty to his readers to share his thoughts, right away. The whole thing had a pleasantly retro feeling; it brought me back to the golden age of blogging, back around 2002 and the “warbloggers” who, whatever their qualifications, expressed such feelings of urgency about each political and military issue as it arose.
Anyway, that’s all background, and I thought of it all only because a similar thing happened to me today.
The real post starts here
Regular readers know that I’ve been taking a break from blogging (wow, it’s been over two months now) except for the occasional topical item that just can’t wait. And today something came up that just couldn’t wait.
Several people pointed me to this news article by Nate Cohn with the delightful title, “How One 19-Year-Old Illinois Man Is Distorting National Polling Averages”:
There is a 19-year-old black man in Illinois who has no idea of the role he is playing in this election. . . .
He’s a panelist on the U.S.C. Dornsife/Los Angeles Times Daybreak poll, which has emerged as the biggest polling outlier of the presidential campaign. Despite falling behind by double digits in some national surveys, Mr. Trump has generally led in the U.S.C./LAT poll. . . .
Our Trump-supporting friend in Illinois is a surprisingly big part of the reason. In some polls, he’s weighted as much as 30 times more than the average respondent . . . Alone, he has been enough to put Mr. Trump in double digits of support among black voters. . . .
Cohn gives a solid exposition of how this happens: When you do a survey, the sample won’t quite match the population, and survey researchers use weights to adjust for known differences between sample and population. In particular, young black men tend to be underrepresented in surveys compared to the general population, so the few respondents in this demographic group need to be correspondingly upweighted. If there’s just one guy in the cell, he might have to get a really big weight. Cohn identifies this as a key problem in the adjustment: the survey is using weighting cells that are too small, and so it gets very noisy adjustments. In this case, the noise manifests itself as big swings in this USC/LAT poll depending on whether or not this one man is in the sample.
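To make the mechanics concrete, here is a minimal sketch of cell weighting with one respondent alone in his cell. Everything in it is invented for illustration: the two cells, the 3% population share, the 1,000-person sample, and the function names `cell_weights` and `weighted_mean` are all assumptions, not the USC/LAT poll's actual cells, data, or procedure.

```python
from collections import Counter

# Invented toy sample of 1,000 respondents: (weighting cell, 1 if Trump supporter else 0).
sample = (
    [("young black men", 1)]
    + [("everyone else", 1)] * 400
    + [("everyone else", 0)] * 599
)

# Assumed population shares for the two cells (hypothetical, not census figures).
population_share = {"young black men": 0.03, "everyone else": 0.97}


def cell_weights(sample, population_share):
    """Weight for each cell = population share / sample share."""
    n = len(sample)
    counts = Counter(cell for cell, _ in sample)
    return {cell: population_share[cell] / (counts[cell] / n) for cell in counts}


def weighted_mean(sample, weights):
    """Weighted proportion of Trump supporters in the sample."""
    total = sum(weights[cell] for cell, _ in sample)
    return sum(weights[cell] * y for cell, y in sample) / total


weights = cell_weights(sample, population_share)
with_him = weighted_mean(sample, weights)

# Drop the single respondent, as if he skipped this wave of the panel.
without = [row for row in sample if row[0] != "young black men"]
without_him = weighted_mean(without, cell_weights(without, population_share))

print(f"weight on the lone respondent: {weights['young black men']:.1f}")
print(f"estimate with him:    {with_him:.3f}")
print(f"estimate without him: {without_him:.3f}")
```

With these made-up numbers the lone respondent carries about 30 times the weight of anyone else, and the topline moves by nearly two percentage points depending on whether he answers.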
There’s also an issue of adjusting for recalled vote in the previous presidential election, but I’ll set that aside for now.
Here’s Cohn on the problems with the big survey weights:
In general, the choice in “trimming” weights [or using coarser weighting cells] is between bias and variance in the results of the poll. If you trim the weights [or use coarser weighting cells], your sample will be biased — it might not include enough of the voters who tend to be underrepresented. If you don’t trim the weights, a few heavily weighted respondents could have the power to sway the survey. . . .
By design, the U.S.C./LAT poll is stuck with the respondents it has. If it had a slightly too Republican sample from the start — and it seems it did, regardless of weighting — there was little it could do about it.
This is fine for what it is, conditional on the assumption that survey researchers are required to only use classical weighting methods. But there is no such requirement! We can now use Mister P.
Here’s a recent article in the International Journal of Forecasting describing how we used MRP for the Xbox poll. Here’s a longer article in the American Journal of Political Science with more technical details. Here’s MRP in the New York Times back in 2009! And here’s MRP in a Nate Cohn article last month in the Times.
Mister P is not magic; of course if your survey has too many Clinton supporters or too many Trump supporters, compared to what you’d expect based on their demographics, then you’ll get the wrong answer. No way around that. But MRP will automatically give the appropriate weight to single observations.
Two issues arise. First, there’s setting up the regression model. The usual plan would be logistic regression with predictors for sex*ethnicity and age*education. We don’t usually see sex*ethnicity*age. This one guy in the survey would influence all these coefficients—but, again, it’s just one survey respondent so the influence shouldn’t be large, especially assuming you use some sort of informative prior to avoid the blow-up you’d get if you had zero African-American Trump supporters in your sample. Second, poststratification. There you’ll need some estimate of the demographic composition of the electorate. But you’d need such an estimate to do weighting, too. I assume the survey organization’s already on top of this one.
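Here is a rough sketch of the two steps, with a big caveat: it is not a multilevel regression. As a stand-in, each cell's raw rate is shrunk toward the rate of its parent group using a fixed pseudo-sample size, which mimics what an informative prior does to a one-person cell, and the cell estimates are then poststratified. The cells, counts, population shares, the value `PRIOR_STRENGTH = 10`, and all function names are assumptions made up for this example; a real MRP fit would estimate the amount of pooling from the data.

```python
# Invented cells: (label, parent group, respondents, Trump supporters, population share).
cells = [
    ("young black men", "black", 1, 1, 0.03),
    ("other black voters", "black", 60, 3, 0.09),
    ("everyone else", "other", 939, 420, 0.88),
]

PRIOR_STRENGTH = 10.0  # assumed pseudo-sample size; a real multilevel model estimates this


def group_rates(cells):
    """Raw support rate within each parent group, combining its cells."""
    n_tot, y_tot = {}, {}
    for _, group, n, y, _ in cells:
        n_tot[group] = n_tot.get(group, 0) + n
        y_tot[group] = y_tot.get(group, 0) + y
    return {g: y_tot[g] / n_tot[g] for g in n_tot}


def raw_estimates(cells):
    """Unpooled cell proportions; poststratifying these matches cell weighting."""
    return [y / n for _, _, n, y, _ in cells]


def pooled_estimates(cells, strength=PRIOR_STRENGTH):
    """Shrink each cell toward its parent group's rate (beta-binomial posterior mean)."""
    rates = group_rates(cells)
    return [(y + strength * rates[g]) / (n + strength) for _, g, n, y, _ in cells]


def poststratify(estimates, cells, group=None):
    """Population-share-weighted average of cell estimates, optionally within one group."""
    pairs = [
        (est, c[4])
        for est, c in zip(estimates, cells)
        if group is None or c[1] == group
    ]
    return sum(est * share for est, share in pairs) / sum(share for _, share in pairs)


raw = raw_estimates(cells)
pooled = pooled_estimates(cells)

print(f"black voters, raw cells:    {poststratify(raw, cells, 'black'):.3f}")
print(f"black voters, pooled cells: {poststratify(pooled, cells, 'black'):.3f}")
print(f"everyone, raw cells:        {poststratify(raw, cells):.3f}")
print(f"everyone, pooled cells:     {poststratify(pooled, cells):.3f}")
```

In this toy setup the one-person cell goes from a raw estimate of 100% support to something much closer to the rate among other black respondents, so he no longer single-handedly pushes the subgroup estimate into double digits.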
So, yeah, we pretty much already know how to handle these problems. That said, there’s some research to be done in easing the transition from classical survey weighting to a modern MRP approach. I addressed some of these challenges in my 2007 paper, Struggles with Survey Weighting and Regression Modeling, but I think a clearer roadmap is needed. We’re working on it.
P.S. Someone forwarded me some comments on a listserv, posted by Arie Kapteyn, Director, USC Dornsife Center for Economic and Social Research:
When designing our USC/LAT poll we have strived for maximal transparency so that indeed anyone who has registered to use our data can verify every step we have taken.
The weights we use to make sure our sample is representative of the U.S. population do result in underrepresented individuals in the sample with a higher weight than those who are in overrepresented groups. In general, one has to make a decision whether to trim weights so that the factor for any individual will not exceed a certain value. However, trimming weights comes with a trade-off, in that it may not be possible to adequately balance the overall sample after trimming. In this poll, we made the decision that we would not trim the weights to ensure that our overall sample would be representative of, for example, young people and African Americans. The result is that a few individuals from groups such as those who are less represented in polling samples and thus have higher weighting factors, can shift the subgroup graphs when they participate. However, they contribute to an unbiased (but possibly noisier) estimate of the outcomes for the overall population.
Our confidence intervals (the grey zones) take into account the effect of weights. So if someone with a big weight participates the confidence interval tends to go up. One can see this very clearly in the graph for African Americans. Essentially, whenever the line for Trump improved, the grey band widened substantially. More generally, the grey band implies a confidence interval of some 30 percentage points so we really should not base any firm conclusion on the changes in the graphs. Admittedly, the weight given to this one individual is very large, nevertheless excluding this individual would move the estimate of the popular vote by less than one percent. Admittedly a lot, but not something that fundamentally changes our forecast. And indeed a movement that falls well within the estimated confidence interval.
So the bottom line is: one should not over-interpret movements if confidence bands are wide.
OK, sure, don’t overinterpret movements if confidence bands are wide, but . . . (1) One concern expressed by Cohn was not just movements but also the estimate itself being consistently too high for the Republican candidate, and (2) With MRP, you can do better! No need to take these horrible noisy estimates and just throw up your hands. Using basic principles of statistics you can get better estimates.
It’s not about trimming the weights or not trimming the weights; it’s about getting a better estimate of national public opinion. The weights (or, more generally, the statistical adjustments) are a means to an end. And you have to keep that end in mind. Don’t get fixated on weighting.
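For anyone who wants to see the variance side of the trimming debate in numbers, one common rule of thumb is Kish's approximate effective sample size, (sum of weights)^2 / (sum of squared weights). The sketch below applies it to an invented set of weights (one respondent at 30, the other 999 slightly under 1) before and after a crude cap at 5. The weights, the cap, and the helper names are assumptions for illustration; real trimming procedures typically also renormalize and rebalance, which this does not do.

```python
def kish_effective_n(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def trim(weights, cap):
    """Crude trimming: cap each weight at `cap`, with no renormalization."""
    return [min(w, cap) for w in weights]


# Invented weights: one respondent at 30, the remaining 999 sharing the rest.
weights = [30.0] + [970.0 / 999.0] * 999

print(f"nominal sample size:      {len(weights)}")
print(f"effective n, untrimmed:   {kish_effective_n(weights):.0f}")
print(f"effective n, capped at 5: {kish_effective_n(trim(weights, 5.0)):.0f}")
```

Under these assumptions a single weight of 30 cuts the effective sample size roughly in half, and capping it recovers most of that. That is the variance half of the tradeoff; the cap's cost in bias does not show up in this calculation at all.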
P.P.S. I guess I should also clarify one point: The classical weighting estimate is not actually unbiased. Kapteyn was incorrect in that claim of unbiasedness.