MRP Carmelo Anthony update . . . Trash-talking’s fine. But you gotta give details, or links, or something!

Before getting to the main post, let me just say that I’m a big fan of Nate Silver. Just for one example: I’m on record as saying that primary elections are hard to predict. So I don’t even try. But there’s lots of information out there: poll data, fundraising numbers, expert opinion, delegate selection rules, etc etc. I admire that Nate’s out there working on this. In academia we often work over and over on the problems we already know how to solve, squeezing out marginal improvements. That’s fine. But I admire someone like Nate who attacks all these problems, using whatever information and tools are available. That’s important.

OK, now on to the main post:

I really can’t figure out what’s going on here. Nate Silver is an admirable statistical analyst, full of good ideas, and then . . . .

OK, here’s the timeline:

2009: Nate and I publish an article in the New York Times using multilevel regression and poststratification (MRP):

2011: Nate says Carmelo Anthony is underrated:

2012-2016: Nate does poll aggregation (using published numbers from public polls), and my colleagues and I analyze raw data from opinion polls to get national and state-level averages. Nate’s using reported toplines so he doesn’t need to worry about MRP, survey weighting, etc.; we’re working with raw data so we use MRP to adjust for known differences between sample and population, and to get stable estimates for geographic and demographic subgroups.
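To make the two steps in the name concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the age cells, the counts, the population shares, and especially the `partial_pool` function, which uses a simple shrink-toward-the-overall-mean formula as a stand-in for a fitted multilevel regression. It is not the model from any of the analyses mentioned above.

```python
# Toy illustration of the two ideas in MRP; all numbers are invented.
# Each cell: (age group, respondents in sample, "yes" answers, population share)
CELLS = [
    ("18-29", 5, 5, 0.20),    # tiny cell: the raw rate of 100% is not believable
    ("30-64", 60, 30, 0.55),
    ("65+", 35, 14, 0.25),
]

def raw_estimate(cells):
    """Unadjusted topline: total 'yes' answers over total respondents."""
    total_n = sum(n for _, n, _, _ in cells)
    total_yes = sum(yes for _, _, yes, _ in cells)
    return total_yes / total_n

def partial_pool(cells, prior_strength=10.0):
    """Shrink each cell's rate toward the overall rate.

    Stand-in for the multilevel regression step: small cells are pulled
    strongly toward the overall mean, large cells barely move.
    prior_strength is a hypothetical tuning constant, not an estimated one.
    """
    overall = raw_estimate(cells)
    return {
        name: (yes + prior_strength * overall) / (n + prior_strength)
        for name, n, yes, _ in cells
    }

def poststratify(cell_rates, cells):
    """Average the cell estimates using population shares, not sample shares."""
    return sum(cell_rates[name] * share for name, _, _, share in cells)

unpooled_rates = {name: yes / n for name, n, yes, _ in CELLS}
pooled_rates = partial_pool(CELLS)

topline = raw_estimate(CELLS)
poststratified_only = poststratify(unpooled_rates, CELLS)
pooled_and_poststratified = poststratify(pooled_rates, CELLS)
```

In this made-up example the five-person cell would dominate a plain poststratified estimate, while the pooled version gives that cell much less leverage. A real analysis would estimate the amount of pooling from the data instead of fixing it by hand.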

Again, I can see why Nate doesn’t need MRP—he’s using the weighted estimates that survey organizations supply in their public summaries, and I guess that’s usually a reasonable choice. He’s a poll aggregator, and that’s what poll aggregators do: they aggregate polls.

And, again, here are some reasons why we do need MRP (or something like it):

– Estimating public opinion in slices of the population

– Improved analysis using the voter file

– Polling using the Xbox that outperformed conventional poll aggregates

– Changing our understanding of the role of nonresponse in polling swings

– Post-election analysis that’s a lot more trustworthy than exit polls

2018: Nate characterizes MRP as “the Carmelo Anthony of election forecasting methods.” At first that sounds cool—MRP is awesome, and Nate’s on record as saying that Carmelo is underrated—but, no, it’s not so cool because, apparently, sometime between 2011 and 2018, Nate’s decided that Carmelo ain’t so great, and sometime between 2009 and 2018, Nate’s decided that MRP ain’t so great.

In 2018, Nate described MRP as “not quite ‘hard’ data.” This is really weird. Nate knows a lot about data, and so do I! And I don’t know what the hell Nate is talking about. The data used in MRP are as “hard” as the data used in all the polls that Nate is reporting. The only difference is, he’s buying the weighted-estimation sausage pre-wrapped in the supermarket, and we’re back in the butcher shop preparing it all. (I’m just riffing off the line about sausage and legislation.) What does he think, that these survey weightings just come out of nowhere? Nope! They’re full of subjective decisions. And see here for what happens if these decisions aren’t well thought out.

I think the old, pre-2018 Nate Silver would’ve thought it was really cool how we adjusted the survey data using party ID to get a more accurate result. What happened to the old Nate Silver, who wanted to use the most up-to-date statistical methods to get the best possible answers? He seems to have been replaced by a new, trash-talking Nate Silver who throws around terms like “hard data,” as if whatever summary report put out by survey organization XYZ using whatever adjustment they happen to use, is somehow on the “hard data” pedestal.

Look. Survey organizations are great. Surveying is hard work. But these published toplines? They’re made by humans. MRP estimates made by YouGov, or Langer Research Associates, or Microsoft Research, even Columbia University: these are hard data too, just as hard as anything produced by exit polls, or the New York Times, or even ESPN.

2020: A colleague who wishes to remain anonymous points me to this new blast from Nate:

My objections to MRP are almost entirely *practical* not theoretical. There’s nothing wrong with smoothing or blending polling data with some other inputs in a clever way to produce more accurate forecasts. That’s what 538 has been doing for 13 years!

But you label it “MRP” and all of the sudden people who ought to know better think it’s a technique with God-like powers. Worse, they use it as an excuse to not be very transparent, in terms of not publishing raw inputs or detailed outputs or much in the way of methodology.

Nor have many of the MRP models I’ve seen spent much time working on estimating uncertainty, including the possibility of correlated errors, which is where often *most* of the work is in modeling if you want your models to remotely resemble real-world conditions.

In the very first MRP paper from 22 years ago we had correlated errors (that is, our model had variation at the regional as well as state levels).
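A small simulation can show why variation at both the regional and state levels implies correlated errors. This is a sketch under assumed values (standard deviations of 1.0 at each level, two states in the same region, normal errors), not a reproduction of the model in that paper. With a shared regional term, the implied correlation between two same-region states is sd_region² / (sd_region² + sd_state²).

```python
import random

def simulate_state_errors(n_sims=20000, sd_region=1.0, sd_state=1.0, seed=0):
    """Draw errors for two states in one region: a shared regional term
    plus an independent state-level term for each state."""
    rng = random.Random(seed)
    errors_a, errors_b = [], []
    for _ in range(n_sims):
        region = rng.gauss(0.0, sd_region)  # common to both states
        errors_a.append(region + rng.gauss(0.0, sd_state))
        errors_b.append(region + rng.gauss(0.0, sd_state))
    return errors_a, errors_b

def correlation(x, y):
    """Pearson correlation, written out to avoid any dependencies."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / (var_x * var_y) ** 0.5

errors_a, errors_b = simulate_state_errors()
simulated_corr = correlation(errors_a, errors_b)

# Correlation implied by the two variance components (assumed values above).
implied_corr = 1.0 ** 2 / (1.0 ** 2 + 1.0 ** 2)
```

Setting `sd_region` to zero in this sketch makes the two states' errors independent, which is the case a model with only state-level variation would implicitly assume.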

I disagree that most of the work in survey adjustment is in correlated errors. Consider the notorious state polls in 2016 in Michigan, Wisconsin, and Minnesota. It’s our impression that the big problem there was not correlated errors but rather that these surveys did not adjust for education. And the other problem was poll aggregators—including me!—focusing on the aggregation and not looking carefully enough at where those numbers were coming from.
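Here is a hedged sketch of how skipping the education adjustment can matter. All the numbers are invented, not taken from any 2016 poll: a sample that matches the population on age but has too many college graduates. Adjusting for age alone leaves the estimate untouched, while adjusting for age and education moves it by several points.

```python
from collections import defaultdict

# (age, education, sample size, share supporting, population share): all invented
CELLS = [
    ("young", "college", 300, 0.60, 0.15),
    ("young", "no_college", 200, 0.45, 0.35),
    ("old", "college", 300, 0.55, 0.15),
    ("old", "no_college", 200, 0.40, 0.35),
]

def adjusted_estimate(cells, use_age, use_education):
    """Poststratify on whichever variables are switched on.

    Cells are collapsed over any variable that is switched off, so the
    sample's imbalance on that variable goes uncorrected.
    """
    sample_n = defaultdict(float)
    sample_yes = defaultdict(float)
    pop_share = defaultdict(float)
    for age, edu, n, rate, share in cells:
        key = (age if use_age else None, edu if use_education else None)
        sample_n[key] += n
        sample_yes[key] += n * rate
        pop_share[key] += share
    return sum(pop_share[k] * sample_yes[k] / sample_n[k] for k in sample_n)

unadjusted = adjusted_estimate(CELLS, use_age=False, use_education=False)
age_only = adjusted_estimate(CELLS, use_age=True, use_education=False)
age_and_education = adjusted_estimate(CELLS, use_age=True, use_education=True)
```

In this toy setup the unadjusted and age-only estimates sit above 50% and the education-adjusted one sits below it. The error comes from the adjustment scheme, not from sampling noise, and it would show up in every poll that shared the same scheme.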

Back in 2016, both Nate and I should’ve done a bit more MRP and a bit less “smoothing or blending polling data with some other inputs in a clever way.”

Moving forward: I think it’s pretty clear that when Nate is criticizing MRP in polling, he’s not criticizing what I do, or what David Rothschild at Microsoft Research does, or what YouGov does, or what Gary Langer does, or what other respected pollsters do. And I don’t think he’s criticizing the research of Jeff Lax, Justin Phillips, or the other political scientists studying state and local opinion using MRP.

Specifically, Nate is criticizing “people who ought to know better” who “think it’s a technique with God-like powers.” Who are these people? I’d like to at least see some references of people who think MRP is “a technique with God-like powers.” What are the MRP models that Nate has seen that have not considered the possibility of correlated errors?

This is the goddamn internet, for chrissake. If you have a claim that you can support, then link to it already!!!!

If there are people out there “who ought to know better” who think MRP is “a technique with God-like powers,” then tell us who they are. If there are people who use MRP “as an excuse to not be very transparent, in terms of not publishing raw inputs or detailed outputs or much in the way of methodology,” then tell us who they are. Give us some quotes, give us some links. And I’ll be first in line to explain how MRP can go wrong. Indeed, I’ve already done it!

I really don’t understand what’s going on here. Nate does good work. I’d be glad to join him in criticizing bad work, but it’s hard to do so when he never gives any examples.

Trash-talking’s fine. But you gotta give details. Otherwise you’re not trash-talking. You’re just trash.

P.S. Please again read the first paragraph of this post. I think Nate’s great. I don’t understand what’s his problem with MRP—a 22-year-old statistical method that’s still going strong—so it would be good to get to the bottom of this one.

16 thoughts on “MRP Carmelo Anthony update . . . Trash-talking’s fine. But you gotta give details, or links, or something!”

  1. I think it’s pretty well known in media that Nate spends half of his time on Twitter trolling these days. You can’t read too much into it. It’s clear he’s lost interest in innovations in forecasting and verged into entertainment and take punditry to a degree that’s probably undermining.

    • Anon:

      Even if Nate spends half his time on twitter, I bet he can get a lot of good work done during the other half of his time. And that’s fine with me: Nate has worked hard, he’s entitled to some relaxation. I just want the specifics from his post. There must be some people out there he’s thinking of, who think MRP is “a technique with God-like powers. . . . Worse, they use it as an excuse to not be very transparent, in terms of not publishing raw inputs or detailed outputs or much in the way of methodology.” I’d like to see some examples.

      • I’m not saying that he shouldn’t be able to relax. I just mean I think there may not be a lot to his MRP musings. To me, this is just him trolling, which he does now with regularity. The argument he’s put forth seems incoherent and is personal and vague.

    • The Democratic Presidential Candidate must be a Straight White Male Movie Star

      Wins Above Replacement. The Democratic Presidential candidate must be the person who will gain the greatest number of victories for Democratic candidates at all levels, local, state, and national.

      If we pick anyone else we kill our own children. If you don’t have children or yours are protected by privilege, you kill other people’s children and the planet itself.

      Ask Nate Silver to form a group to help pick the candidate. Most likely she or he will be a straight white male movie star.

      Greatest number of victories is not an exclusive criterion. The exclusive criterion is that she or he must also be able to work with all the groups in a partnership of ideas.

      You may think your candidate will further the interests of whatever group or segment of the Democratic community you identify with. You may be right. But at the expense of the Democratic Party as a whole.  It raises the chance that Donald Trump will be reelected. It raises the chances that the Senate and House will remain a quagmire while the Republicans continue to pack the Supreme Court with their flunkies.

      The stakes are too high.  Anything we do that results in a victory for the Republican Party will directly come down on our children. But more on the children of poorer people in other parts of the world.  Presidential candidates have money and influence. They may be able to pull themselves and their families onto higher ground. Or fly them there.  The rest of us will drown. Whoever you are, however sincere in your beliefs, whatever sex, race, nationality or sexual orientation, if we do not nominate the individual who will bring the Democratic Party the greatest number of victories across all political levels, we murder our own children.

  2. > tell us who they are.

    They are Men from the Planet Straw.

    > I really don’t understand what’s going on here.

    What is the causal effect on Nate’s future influence/income of trash-talking versus not trash-talking? I doubt it is negative, e.g., Taleb.

  3. > I think the old, pre-2018 Nate Silver would’ve thought it was really cool how we adjusted the survey data using party ID to get a more accurate result. What happened to the old Nate Silver, who wanted to use the most up-to-date statistical methods to get the best possible answers?

    I suspect if you’re the incumbent election oracle with a huge platform, you’re just not very interested in praising the modelling of others, especially if these approaches have fancy sounding names like MRP – sophisticated analysis is what people should come to you for! Yougov and others should kindly publish the ‘raw data’ and then kindly take a step back and let the professionals at 538 do the heavy duty modelling.

    Besides, MRP is just not very transparent. The gold standard of transparency is a poll-and-other-things-aggregation model whose code isn’t made available.

    • Anon:

      I don’t think it’s that simple. Nate’s a poll aggregator so he should be happy to aggregate results from any pollster (Yougov, whoever) that supplies state-level poll estimates. I don’t think Nate usually works with the raw data, anyway. If he’s willing to defer to pollsters on sampling, interviewing, and adjustment, why not defer to them on regularization too?

      I still feel like there must be some bad MRP out there that Nate’s reacting to. I’d just like to know what it is.

    • More:

      I actually wrote about that example when it came out!

      Following your link, I agree with this criticism from Nate: “if the northern border of Nebraska has extremely high political tolerance and the neighboring counties on the southern border of South Dakota have extremely low political tolerance, something was badly misspecified.” But I don’t think this means that MRP is bad! It just means that this model could be improved. Indeed, whenever you see a county-level map of the U.S., you should look for discontinuities at state borders, as this can represent problems with data or model.

      • Andrew, Thanks for the link. I should have expected that the example was already discussed on this blog! :)

        Still, since this is a study that Nate Silver dislikes and you like, maybe it is a good way to understand exactly what he finds objectionable. The border discontinuity issue is one objection but, as best I can tell, the criticism focuses on Rothschild and Konitzer’s use of 2,000 people’s survey responses to rank 3,000 counties. This matter of inadequate sample sizes is the closest match that I see to Nate Silver’s complaints about experts trying to use MRP to “spin straw into gold” and crediting it with “God-like powers”. Though that last statement may be a bit hyperbolic on Nate Silver’s part, depending on your religious beliefs.

  4. Side point! But I don’t like the scales on this graph!

    The question is “Who supports health care reform?”, but the color scheme and the legends make it look like it’s a 50/50 split positive or negative or something.

    But zooming in and reading the text on the right, it’s majority support everywhere! The scale on the right is somewhere between 50% and 90% support!

  5. I’m not inside his head, but it’s worth pointing out that any given technique operates on multiple levels. There’s 1) responsible use by informed practitioners and then 2) there’s irresponsible imitation by others. For obvious reasons I won’t name names, but you can look at a large crop of graduate student papers and see that MRP has reached a point where #2 is fairly common now. Even if we only focus on #1, though, there’s a difference between the work and how “consumers” respond to it. Even if practitioners are very disciplined in how they write things up, 3) certain techniques acquire a sheen that leads “consumers” to believe they are magical and solve problems they don’t solve. I think we’ve evolved to #3 on MRP now and that’s the source of the reaction, which seems clear to me from what he wrote.

    Also, he clearly *doesn’t* like what YouGov is doing. He’s referred repeatedly in the last several days in a generally negative way to the fact that YouGov is producing MRP models not polls (for example https://twitter.com/NateSilver538/status/1224066224247791623 or https://twitter.com/NateSilver538/status/1224376588810248192) and I think this is actually the origin of the remarks in question. I think part of this has to do with a concern over how MRP interacts with poll aggregation as he practices it.

      • It seems like this depends on the model and whether it is doing shrinkage that depends on data outside the poll, beyond the info about the population distribution of covariates. Like, is it using an outcome model that shrinks answers in the present poll towards answers in last week’s poll?
