How can news reporters avoid making mistakes when reporting on technical issues? Or, Data used to justify “Data Used to Justify Health Savings Can Be Shaky” can be shaky

Reed Abelson and Gardiner Harris report in the New York Times that some serious statistical questions have been raised about the Dartmouth Atlas of Health Care, an influential project that reports huge differences in health care costs and practices in different places in the United States, suggesting large potential cost savings if more efficient practices are used. (A claim that is certainly plausible to me, given this notorious graph; see here for background.)

Here’s an example of a claim from the Dartmouth Atlas (just picking something that happens to be featured on their webpage right now):

Medicare beneficiaries who move to some regions receive many more diagnostic tests and new diagnoses than those who move to other regions. This study, published in the New England Journal of Medicine, raises important questions about whether being given more diagnoses is beneficial to patients and may help to explain recent controversies about regional differences in spending.

Abelson and Harris raise several points that suggest the Dartmouth claims may be overstated because of insufficient statistical adjustment. Abelson and Harris’s article is interesting, thoughtful, and detailed, but along the way it reveals a serious limitation of the usual practices of journalism, when applied to evaluating scientific claims.

The problem is that Abelson and Harris apply a shotgun approach, shooting all sorts of criticisms at the study without a sense of what makes sense and what doesn’t. For example, they write:

But while the research compiled in the Dartmouth Atlas of Health Care has been widely interpreted as showing the country’s best and worst care, the Dartmouth researchers themselves acknowledged in interviews that in fact it mainly shows the varying costs of care in the government’s Medicare program. Measures of the quality of care are not part of the formula.

For all anyone knows, patients could be dying in far greater numbers in hospitals in the beige [low-spending] regions than hospitals in the brown [high-spending] ones, and Dartmouth’s maps would not pick up that difference. As any shopper knows, cheaper does not always mean better.

Setting the maps aside, could it really be true that “patients could be dying in far greater numbers in hospitals in the beige regions than hospitals in the brown ones”?? I really doubt that, and I’m pretty sure that this has been checked. I mean, that’s an obvious thing to look at. And, in fact, later on in the news article, the authors write that, “a 2003 study found that patients who lived in places most expensive for the Medicare program received no better care than those who lived in cheaper areas.”

So what’s the deal with “For all anyone knows, patients could be dying in far greater numbers in hospitals in the beige regions than hospitals in the brown ones”? Are Abelson and Harris saying there was a problem with the 2003 study, or that there have been big changes in 2003, or . . . ? It’s certainly possible that I’m missing something here myself!

Abelson and Harris then write:

Even Dartmouth’s claims about which hospitals and regions are cheapest may be suspect. The principal argument behind Dartmouth’s research is that doctors in the Upper Midwest offer consistently better and cheaper care than their counterparts in the South and in big cities, and if Southern and urban doctors would be less greedy and act more like ones in Minnesota, the country would be both healthier and wealthier.

But the real difference in costs between, say, Houston and Bismarck, N.D., may result less from how doctors work than from how patients live. Houstonians may simply be sicker and poorer than their Bismarck counterparts. Also, nurses in Houston tend to be paid more than those in North Dakota because the cost of living is higher in Houston.

Huh? One of the striking things about the cost-of-care map is how little it corresponds with cost of living. The high-cost regions include most of Texas, just about all of Louisiana, Mississippi, Arkansas, Oklahoma, and Tennessee (as well as some more expensive places such as the Bosnywash area and much of California). There may be a lot of problems with this study, but I can’t imagine that one of these problems is a lack of accounting for the (relatively) high cost of living in Houston.

How can this sort of energetic reporting be improved?

I picked on a couple of funny things in the Abelson and Harris article, but overall I think they’re doing a service by examining an influential report. My purpose is not to slam them but to suggest how they could do better.

Their fundamental difficulty, I think, is the challenge of writing about a technical topic (in this case, statistics) without the relevant technical experience. (I was going to say “technical training,” but it’s not clear that Stat 101 or even a Ph.D. in statistics or economics will really teach you how to get a sense of perspective when evaluating quantitative claims.) The usual journalistic solution to reporting a technical controversy is to retreat to a he-said, she-said template, quoting experts on both sides and letting the reader decide. To their credit, Abelson and Harris try to do better than this by raising statistical objections on their own–but this strategy can backfire, as in the two examples above.

What should they (and other similarly-situated reporters) do next time? To start with, I’d recommend getting more input from qualified outsiders. Abelson and Harris do this a bit, with a quote from health economist David Cutler–but they only give Cutler one offhand sentence. I’d be curious to hear what Cutler would say about the claim that “for all anyone knows, patients could be dying in far greater numbers in hospitals in the beige regions than hospitals in the brown ones.”

In many ways, the article discussed above is almost all the way there. And with the resources and reputation of the New York Times, these reporters should have no problem getting sound opinions from outside experts that will allow them to focus their ideas a bit.

P.S. The Times should get up-to-date with their web-linking. For example, here’s a quiz for you. Find the problem in the paragraph below:

Wasteful spending — perhaps $700 billion a year — “does nothing to improve patient health but subjects you and me to tests and procedures that aren’t necessary and are potentially harmful,” the president’s budget director, Peter Orszag, wrote in a blog post characteristic of the administration’s argument.

Did you catch that? They quote from a blog but don’t link to it, instead linking to a generic search-for-Peter-Orszag page in the Times archive. I can’t imagine the editors were purposely avoiding the blog link; rather, they’re probably just not in the habit of linking to blogs. (I did a quick google and found Orszag’s blog here.)

P.P.S. Jonathan Skinner, a coauthor of the Dartmouth atlas, responds in detail on the New York Times website. The most impressive thing about this response is that Skinner published it on June 13, 2009–nearly a year before the Abelson and Harris article appeared. Now that’s what I call rapid response!

P.P.P.S. More here by Merrill Goozner, who talks about a number of ways in which it can be difficult to generalize from the Dartmouth study to make policy recommendations.

P.P.P.P.S. Still more here, from Hank Aaron at the Brookings Institution.

19 thoughts on “How can news reporters avoid making mistakes when reporting on technical issues? Or, Data used to justify “Data Used to Justify Health Savings Can Be Shaky” can be shaky”

  1. I had some of the same problems you did with this article this morning, but I suspect that the difference isn't raw mortality rate differences, which would indeed stick out blatantly, but risk-adjusted mortality (and morbidity) differences, which are, in my limited experience with them, at least as much a product of the modeling framework as of what the data can tell you. So if you just put an implied "risk-adjusted" in front of every discussion of efficacy, and if you grant that "risk adjustment" is both difficult and subject to various political pressures, I think the article makes more sense than you give it credit for.
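The commenter's point, that "risk-adjusted" mortality is at least as much a product of the modeling framework as of the data, can be sketched with invented numbers. In this toy example (all counts and reference rates are hypothetical), the same observed deaths yield opposite hospital comparisons depending on whether the risk model adjusts for age alone or for age and severity:

```python
# Toy sketch: risk-adjusted mortality depends on the adjustment model.
# All numbers are invented for illustration.

# (deaths, patients) by (age group, severity) for two hypothetical hospitals
hospital_x = {("old", "mild"): (2, 100), ("old", "severe"): (18, 60)}
hospital_y = {("old", "mild"): (4, 120), ("old", "severe"): (10, 20)}

# Reference death rates under two different risk models
rate_by_age = {"old": 0.08}                    # model 1: adjust for age only
rate_by_age_sev = {("old", "mild"): 0.03,      # model 2: adjust for age AND severity
                   ("old", "severe"): 0.25}

def smr(hospital, expected_rate):
    """Standardized mortality ratio: observed deaths / expected deaths,
    where 'expected' is driven entirely by the chosen risk model."""
    observed = sum(d for d, _ in hospital.values())
    expected = sum(n * expected_rate(key) for key, (_, n) in hospital.items())
    return observed / expected

for name, h in [("X", hospital_x), ("Y", hospital_y)]:
    m1 = smr(h, lambda k: rate_by_age[k[0]])   # model 1
    m2 = smr(h, lambda k: rate_by_age_sev[k])  # model 2
    print(f"Hospital {name}: SMR (age only) = {m1:.2f}, SMR (age+severity) = {m2:.2f}")
```

Under model 1, hospital X looks worse; once severity enters the model, hospital Y looks worse. The ranking flips with no change to the underlying data, which is the commenter's "product of the modeling framework" concern.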

  2. Jonathan:
    As you mention, the issue with risk adjustment is that it's almost impossible to do well. Comparisons of urban tertiary-care hospitals and rural primary-care hospitals are almost impossible. The data on why the differences exist are not collected, and even with all the data collection, you still would not account for the self-selection that occurs between more serious and less serious cases within a hospital.

    Complex enough systems with too-small data sets are not amenable to fair statistical analysis, and this may be a case of that. I'd rather see non-risk-adjusted numbers that are correct than attempt to "correct" the numbers further and inject nearly meaningless risk factors into the data.

    Either way, however, the article should clarify what it is discussing.

  3. It's just misleading to say the researchers further adjusted for any differences in patients.

    They _attempted_ to adjust, and with observational data surely failed to some possibly important extent.

    Also, having had access to the data (somewhere else) that went into a care atlas, it's extremely hard, even with this access, the available documentation, and helpful advisors (who just don't write down as much as they should of what they know about the data collection), to get a sense of what the data actually do capture.

    It's a very difficult area, with especially taxing privacy concerns, about 1/10 of the funding needed to do it properly, and pressures to make possibly exaggerated claims about its "products".


  4. Keith:

    Yes, this stuff is hard. Unless the adjustment is really obvious (for example simple age-adjustment by 5-year cohorts), I think the best thing is to present the unadjusted numbers along with adjusted, and an explanation of how they differ. I know nothing about the details of the Dartmouth atlas; my comments above are more in the lines of general thoughts.
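The suggestion above, presenting unadjusted numbers alongside simple age-adjusted ones, can be sketched with direct age standardization over coarse age cohorts. All the counts and the standard population below are invented for illustration; note that the adjustment can even reverse which region looks worse:

```python
# Hypothetical illustration of reporting crude (unadjusted) death rates
# alongside directly age-standardized rates. All numbers are invented.

# (deaths, population) per age cohort for two made-up regions
region_a = {"65-69": (30, 10_000), "70-74": (60, 8_000), "75-79": (120, 5_000)}
region_b = {"65-69": (25, 5_000),  "70-74": (55, 6_000), "75-79": (110, 9_000)}

# Shared standard population used for the adjustment (also invented)
standard_pop = {"65-69": 15_000, "70-74": 14_000, "75-79": 14_000}

def crude_rate(region):
    """Unadjusted death rate: total deaths / total population."""
    deaths = sum(d for d, _ in region.values())
    pop = sum(n for _, n in region.values())
    return deaths / pop

def age_adjusted_rate(region, standard):
    """Direct standardization: weight each cohort's death rate
    by the shared standard population."""
    total = sum(standard.values())
    return sum((d / n) * standard[age] for age, (d, n) in region.items()) / total

for name, region in [("A", region_a), ("B", region_b)]:
    print(f"Region {name}: crude rate {crude_rate(region):.4f}, "
          f"age-adjusted rate {age_adjusted_rate(region, standard_pop):.4f}")
```

Here region B has the higher crude rate but the lower age-adjusted rate, because its population is skewed toward the oldest cohort. Reporting both numbers, plus the explanation of why they differ, is exactly the kind of transparency being suggested.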

  5. Sites that make money on advertising tend not to want you to leave. The issue of NY Times linking to blogs is a bit trickier.

    For one thing, linking to a blog presents a maintenance nightmare for a newspaper. What happens when the blog changes URLs or shuts down, or worse yet, someone acquires the URL after the blogger leaves it and uses it for a porn site or worm/virus spreader? These aren't big issues for an academic stats blog, but the NY Times?

    Another issue is that all of the news agencies from the NY Times to the Wall St Journal to Thomson-Reuters to Bloomberg are engaged in what looks to be the construction of in-house encyclopedias. They have specific pages for entities like people, locations, companies, congressional bills, etc., and link their stories to those (and vice-versa). For instance, here's the NY Times's:

    I believe the goal is to become a trusted authority for the encyclopedic information as well as the news. Sort of like an in-house well-curated Wikipedia. I think the Wikipedia's pretty tough competition, especially in timeliness and global breadth.

  6. I used to work at the website of the NYT a couple of years back. All the links to the topic pages are automatically generated. For article-specific external links, the producer who publishes the article has to manually add the links. I've seen quite a number of excellent examples where there are plenty of links within an article, but it depends (at least while I was still there) on communication between the reporter/editor and the web producer. Since there are other external links in the story, it looks like this was a case where the reporter did not ask to have that link included (and the web producer didn't add it on his/her own).

  7. You really should read the email exchange between the researchers and the reporters.

    The article seems quite dishonest in many respects. The researchers clearly answered the question about variable cost of living. In addition, the reporters repeatedly distort the distinction between the Dartmouth work on the differences in overall Medicare costs (which are driven almost entirely by hospital utilization) and the more recent work on differences between hospitals.

    For example, the article says:

    In just one example of this extrapolation, Dr. Fisher, in testimony before Congress last year, summarized his and others’ work by asking, “Why are access and quality worse in high-spending regions?”

    [My bold]

    The article then quotes from the Dartmouth web site about evidence that suggests that hospitals with higher utilization might have better outcomes.

    Next, comes this:

    While a few studies by other researchers have shown that more spending leads to worse health, some others have suggested the opposite — that more expensive hospitals might offer better care. But many have shown no link, either way, between spending and quality.

    In other words, there is little evidence to support the widely held view, shaped by the Dartmouth researchers, that the nation’s best hospitals tend to be among the least expensive.

    This is completely disingenuous. The reporters impute a view to the researchers that researchers don't hold by using the weasel words "shaped by" while simultaneously distorting what we really do know. There is very solid evidence at the regional level that higher costs are mostly due to higher rates of hospitalization.

  8. As a journalist, not a statistician, I enjoyed reading their article and then your blogpost. They've both left me pretty confused though. For example: OK, maybe higher numbers of people dropping dead is far-fetched, but it was compelling to me when they said that NJ comes in dead last for Dartmouth, but almost top when it comes to quality of care. I felt this reinforced their point that maybe higher numbers of people were dropping dead.

    As a journalist, I take your advice that this is something that could be verified or posed to experts. Also, is this logic flawed in some major way? Does one have no impact on the other?

  9. Shubha:

    Yes, I agree that the NJ thing is interesting. My point was not that Abelson and Harris were wrong, or that the Dartmouth people were right; my point was that Abelson and Harris were mixing in good arguments with bad, rather than stepping back and taking a look at the big picture–something that I suspect they could do, if they could get the perspective of outside experts, rather than merely using these experts to make specific points in the larger, messy article.

  10. Re the notorious graph, is it possible that one of the reasons more money is spent per capita on health care in the U.S. is that health care costs more relative to other goods and services in the U.S. than in other countries? How would the graph look if health care costs were expressed not in terms of U.S. dollars but in terms of U.S. health care dollars, ignoring the contributions of other types of goods and services when adjusting for purchasing power? To what extent are Americans getting more health care, rather than spending more for the same amount of care?

  11. Maggie Mahar points out that a number of people quoted in the story feel that though the quotes were literally correct, they were in substance misrepresented.

    I agree with William Ockham that the article seems dishonest and slanted if you look at it in toto — but not in a way that is obvious to folks unfamiliar with these issues, unfortunately. And even if you are familiar with them, the reporters manage to confuse rather than clarify the issues while taking a different stance.

  12. Dr. Gelman,
    I wanted to point you to our response to complaints from the Dartmouth researchers that our June 3 piece was inaccurate. The response demonstrates conclusively that the Dartmouth researchers have mischaracterized their own research, which I think is rare in academia.

    I also wanted to respond directly to your questions. Your primary criticism was that we applied a shotgun approach in our story, and to that I plead guilty. It is hard to rank order the many problems with the Dartmouth Atlas data. Is the worst problem that it fails to adjust for differences in prices or illness, or that it fails to take into account potential qualitative differences in care provided? All are serious shortcomings. Which is the most important? I frankly don't know.

    And your confusion about our criticism of the Atlas (which offers largely unadjusted data) and the 2003 Annals study (where many of these adjustments were made) is understandable for someone who doesn't know the field.

    Finally, you pointed out that the cost-of-care map does not seem terribly affected by cost-of-living adjustments. But remember that (as we pointed out) those maps are unadjusted for both price and illness. And as you look at those maps, it seems clear that the illness adjustment may be the more important of the two. The Southeast — Texas, Louisiana, Alabama, etc. — looks terrible on those maps. The Southeast also happens to have the highest proportion of uninsured patients in the country.

    What appears to be excessive medicine in those maps may be a function less of greedy doctors and more of patients who become eligible for Medicare at the age of 65 with a host of chronic unaddressed health problems. Medicare's high costs, therefore, may simply be cost-shifting from a stingy private sector.

    The more fundamental issue that we face here is how to tackle highly technical issues like statistics in a lay publication. After much discussion, we have decided that we may in the future post online explainers about the technicalities that might reassure folks like you but would largely be lost to lay readers.

    Still, I applaud you for writing about healthcare. It's a difficult subject but incredibly important, and we need more smart people looking at its underlying data. If you have questions in the future, don't hesitate to call us at the Times and we'll walk you through the issues.


  13. Gardiner:

    Thanks for the note. It's clear that you've thought a lot about these problems and that you know a lot more about health care policy (as well as the specifics of the Dartmouth research) than I do. I agree that it's an important research area, and I can appreciate your frustration in reporting on a study and ending up in an adversarial position regarding the academic researchers. I personally find it frustrating when researchers whom I criticize defend their claims on spurious grounds.

    That said, I recommend that you run your articles past qualified experts–not me, but actual experts, people like David Cutler etc.–to get some outside perspective. It might be that such an outsider, talking with you, might have been able to isolate that Medicare-availability point which you note above but I had not noticed in the original article.

  14. Dr. Gelman,
    Thanks for the suggestion. We are specifically prohibited from sharing pre-publication drafts of our stories, and for good reason. There was a scandal some years ago at The Wall Street Journal about a writer for the "Heard on the Street" column sharing pre-publication drafts of stories about companies that moved markets. Stories in The Times can have profound effects on people and companies, and we cannot share them selectively.

    Having said that, I have long had a policy of reading relevant portions of stories back to experts and even to those featured in stories to ensure accuracy and to make sure people are not surprised. Even that doesn't always work, however. Indeed, I read nearly the entire story at issue here to Dr. Elliott Fisher, one of the Dartmouth researchers, prior to publication. Unfortunately, this didn't forestall his later claims of inaccuracies.

    In fact, I wrote another story in February that quoted Dr. Fisher. The quote in that story was taken directly out of an email Dr. Fisher sent to me. Dr. Fisher later told other reporters that I had misquoted him. You can't win sometimes.


  15. Gardiner: Interesting–I hadn't thought about these sorts of disclosure rules. It also sounds frustrating for you to have had to deal with a moving target. Just to clarify my suggestion above: my idea was not just to get feedback from the Dartmouth people but also to run your story by an outside expert.

  16. Rather than demonstrating conclusively that the Dartmouth researchers have misrepresented their research, the response of Harris and the NYT merely shows that the paper is committed to distorting the truth to push a pre-determined story line. The writers make several claims they knew or should have known are false. Professor Gelman pointed out one in the original blog post. "For all anyone knows" is a phrase with a specific meaning, and Mr. Harris can't hide behind the idea that they were only discussing one edition of the Dartmouth Atlas. He had clear, credible information that the assertion he made was false. He put it in his story anyway. What more needs to be said?

  17. Gardiner:

    Since I've got you on the line, so to speak, I have one other question (inspired by Ockham's quote immediately above). In your news article, you wrote:

    For all anyone knows, patients could be dying in far greater numbers in hospitals in the beige [low-spending] regions than hospitals in the brown [high-spending] ones, and Dartmouth's maps would not pick up that difference.

    Do you really think that it's possible that patients are dying in far greater numbers in hospitals in the beige (low-spending) regions of the map? Especially given that, in your words, "a 2003 study found that patients who lived in places most expensive for the Medicare program received no better care than those who lived in cheaper areas"?

    Just to emphasize here: I know next to nothing about this topic, I'm not taking any sides here, and this is not intended to be a trick or gotcha question. I'm just trying to track down what's going on here with the different claims and counter-claims. Thanks for your help.

  18. Dr. Gelman,
    Sorry to take so long to respond. I had failed to see your question until now. The story actually answered this question when we showed how the relative rankings of hospitals within Wisconsin changed depending upon the measures used. The ranking of St. Mary's, for instance, dropped from 11th in Dartmouth's cost ranking to 67th in the state, or second-to-last, on a ranking based on federal patient mortality scores. The same story happens with states. As we said in the story, New Jersey ranks dead last in the Dartmouth Atlas because of its costs per Medicare patient. But in a federal quality measure based largely on mortality, New Jersey is second only to Vermont.

    The objective of the "For all anyone knows…" paragraph was to point out that Dartmouth's cost rankings are devoid of any quality or outcome measures. As New Jersey shows, care can be both expensive AND high quality. Again, New Jersey is deep brown on the Dartmouth map. But its hospitals, in aggregate, have lower mortality rates than those in the beige regions. Does that answer your question? And if you're asking more broadly whether mortality scores can vary by hospital, the answer is yes.

    The larger point of the paragraph was to knock down the notion that the Dartmouth research has proven a negative correlation between spending and outcomes (that the beige regions have better care than the deep brown ones) — a notion advanced by the researchers themselves and broadly accepted on Capitol Hill and by many experts and journalists.

    And let's think for a minute about what such a negative correlation would mean. Since healthcare spending increases every year, a negative correlation between spending and outcomes in an absolute sense would mean that patients in this country got sicker each year. Does that make any sense to you?


    PS And just to reiterate, the ban on sharing stories prior to publication covers everyone — even supposedly neutral experts (and if you find any neutral experts, let me know). Also, since there was nothing factually wrong with the story at issue here, I would argue that what you really wanted was someone to give us advice about better explaining some of these issues. That's what editors are for, and we have the best in the business here. I'm sorry the process didn't seem to work for you on this one, but I think that's because you are looking for a level of expert clarity that we have trouble providing in a lay publication. As I said, we are considering remedying this by posting web-only explainers for the expert audience.

Comments are closed.