Why is it so hard for them to acknowledge a correction?

Anne Case (as quoted by Jesse Singal):

We spent a year working on this paper, sweating out every number, sweating out over what we were doing, and then to see people blogging about it in real time — that’s not the way science really gets done. . . . And so it’s a little hard for us to respond to all of the blog posts that are coming out. . . . And if this is all people shooting from the hip, I don’t think that’s any way to move science forward, to move the research forward.

Angus Deaton (as quoted by Kate Murphy):

There’s been a tremendous amount of recalculating of our numbers in the media and blogs. It’s all happening at the speed of light. You do worry whether it’s being done appropriately. But we’re not complaining.

“We’re not complaining”? That’s the best you can do???

How bout this:

We spent a year working on this paper, sweating out every number, sweating out over what we were doing, and we’re happy to see people blogging about it in real time.

We very much appreciate the effort put in by Laudan Aron, Lisa Dubay, Elaine Waxman, and Steven Martin, Philip Cohen, and Andrew Gelman to uncover the aggregation bias in our analysis, to correct for that bias, and to explore subtleties that we did not have a chance to get into in our paper. As Gelman noted, these corrections are in no way a debunking of our work—our comparisons of non-Hispanic American whites to groups in other countries and other ethnic groups still stand.

We think it’s great that, after our paper was published in PNAS, it was possible to get rapid feedback. Had it not been for bloggers, we’d still be in the awkward situation of people trying to explain an increase in death rates which isn’t actually happening. We join Paul Krugman and Ross Douthat in thanking these bloggers for their unpaid efforts on behalf of everyone interested in this research. We count ourselves lucky to live in an era in which mistakes can be corrected rapidly, so that we and others do not have to wait months or even years for published corrections which themselves could contain further errors.

As economists, we recognize that research work is always provisional, and that anyone studying the real world of human interactions has to accept that mistakes are part of the process. It is only through the efforts of our entire research community—publishing in journals, publishing in blogs, through informal conversations, whatever—that we move toward the truth. We always considered our PNAS paper to be just a single step in this process and we are glad that others have taken the trouble to correct some of our biases and omissions.

Again, we thank the many researchers who have taken a careful look at our analyses. It’s good to know that our main findings are not affected by the corrections, we welcome further research in this area, and we hope that future discussion of our work, both in the scientific literature and in the popular press, makes use of the corrected, age-adjusted trends.

– Sincerely, Anne Case and Angus Deaton

P.S. We have heard some people criticize the researchers noted above because they published their work in blogs rather than in peer-reviewed journals. We would never make such a silly, uninformed criticism. Since appearing in print, our work has received a huge amount of publicity. And, to the extent that we made mistakes or did not happen to explain ourselves clearly enough, it is the responsibility of others to publish their corrections and explanations as rapidly as possible. Blogs are a great way to do this. Blogs, unlike newspaper interviews, allow unlimited space to develop arguments and to present graphs of data. And we are of course aware that peer-reviewed journals make mistakes too. We published our paper in the Proceedings of the National Academy of Sciences, a journal that last year published a notorious paper on himmicanes and hurricanes, another discredited paper claiming certain behavior by people whose ages end in 9, and another paper on demographics which neglected to apply a basic age adjustment. So, yes, publication in journals is fine, but we very much welcome researchers who are willing to stick out their necks and correct the record in real time on blogs.

See, that wasn’t that hard!

Case and Deaton did great work. No need for them to get so defensive and attack-the-messenger-y about it. Indeed, I had an email exchange with Angus Deaton a few days ago and he was completely polite and reasonable, directly pointing me to the data set that they had used in their paper. In my opinion, Case and Deaton have acted as exemplary scientists in this entire episode, with only the very minor exception of a difficulty in graciously handling public corrections of their work. Hence my disappointment. Really only a small thing in the grand scheme of things, but it still bugs me, hence this post.

And, contra Case, this sort of give-and-take, whether it appears in blogs or wherever, is exactly the way science really gets done. Read your Lakatos.

P.P.S. Regarding the line above, “trying to explain an increase in death rates which isn’t actually happening”: As I’ve emphasized throughout, the comparison to other countries remains interesting, as the age-aggregation bias is small compared to the observed declines of 20% in mortality rates in other countries and among other groups. But in the U.S. press there’s been a lot of explanation of why things are getting worse here, not just in a comparative but in an absolute scale. And, as the age-adjusted data show, the mortality rate in this group has been flat since 2005—increasing for women, decreasing for men, with an average change of just about zero. If people want to explain that, fine, but Case and Deaton have all of us to thank for the fact that people can now be focused on explaining this pattern, not an artifactual pattern of a steady increase in death rates which is what everyone was talking about before.

Or, as discussed in the comments, Case and Deaton don’t have to thank me, or Laudan Aron, Lisa Dubay, Elaine Waxman, and Steven Martin, or Philip Cohen, at all. But could they please refrain from slamming us? I’d like for Case and Deaton to show our work the same respect and consideration that we are showing theirs.

41 thoughts on “Why is it so hard for them to acknowledge a correction?”

  1. >”We spent a year working on this paper, sweating out every number, sweating out over what we were doing”

    Yet they do not deal with the possibility this is due to “a combination of a) changes in reporting/coding b) binning of ages and c) cohort issues.”[1]

    They needed to attempt accounting for various analysis/collection artifacts before attributing the effect to something real. If they really did put forth all that effort, I suspect they concluded this data is too messy for a real analysis and then tried to publish something anyway. The lack of age adjustment remains bizarre to me, that is data analysis 101 stuff.

    [1] http://statmodeling.stat.columbia.edu/2015/11/06/what-happened-to-mortality-among-45-54-year-old-white-non-hispanic-men-it-declined-from-1989-to-1999-increased-from-1999-to-2005-and-held-steady-after-that/#comment-251274

    • Anon:

      I’m working on a post on this for tomorrow, but my quick guess on the lack of age adjustment is that they naively thought that 10-year bins would be narrow enough for age adjustment not to matter. I don’t think there’s any practical need to age adjust within 1-year bins, for example.

      Their mistake on the binning was quantitative, not qualitative—and, as we’ve discussed many times on this blog, social science is often understood in a qualitative way: Is the effect there or not, are death rates increasing or decreasing, whatever.

      One frustration I had with the first round of this discussion—back when I (following your suggestion) did a back-of-the-envelope estimate of the bias in Case and Deaton’s trends—was to see the reaction that the correction didn’t really matter since the death rates were still increasing, even after age adjustment. It was frustration with this argument that motivated me to look at the data more carefully and break it up into the 1999-2005 and 2005-2013 periods, and to look at men and women separately, so that the “it’s still increasing” argument couldn’t really apply.

      In her recent interview, Case seemed to be disparaging age adjustment on the silly grounds that different methods of age adjustment will lead to different answers (as if that would somehow imply that unadjusted estimates are ok), but I think at this point this was just defensiveness. I suspect that, had a reviewer at a journal recommended a simple age adjustment, Case and Deaton would’ve just done it and never looked back. They then would’ve talked about the 1995-2003 trend and we’d never be having this discussion. Unfortunately it seems that two journals rejected the paper with no reviews, and the third accepted it without any reviewer catching the problem. As Case and Deaton note in their (hypothetical) letter in the above post, this is just an example of how post-publication peer review can be a good thing.

      • >”I don’t think there’s any practical need to age adjust within 1-year bins, for example. Their mistake on the binning was quantitative, not qualitative”

        I am not so sure this couldn’t account for a nice chunk of it. The mortality increase we are talking about is about the same as what is observed due to aging one year (~40 deaths/100k), so we are right on the edge of changing the qualitative claim as well:

        “First, Urban is working with different numbers for the increase in mortality rates per 100,000 people between 1999 and 2013. Urban claims increases of 26.8 and 7.7 for women and men, respectively; Case said they saw increases of about 40 and 28 — higher numbers, but less of a gap.”

    • Why do they not mention this?
      “Death rates for AIAN, API, and, to a lesser extent, Hispanic populations are known to be too low because of reporting problems (see ‘‘Race and Hispanic origin’’ in the Technical Notes).
      […]
      The 2003 revision of the U.S. Standard Certificate of Death allows the reporting of more than one race (multiple races) (19). This change was implemented to reflect the increasing diversity of the population of the United States, to be consistent with the decennial census, and to reflect standards issued in 1997 by the Office of Management and Budget (OMB).
      […]
      Death rates for Hispanic, AIAN, and API persons should be interpreted with caution because of inconsistencies in reporting race on death certificates compared with such reporting on censuses, surveys, and birth certificates. Studies have shown underreporting on death certificates of AIAN, API, and Hispanic decedents, as well as undercounts of these groups in censuses (4–6).”
      http://www.cdc.gov/nchs/data/nvsr/nvsr61/nvsr61_06.pdf

      This pattern in the mortality data is interesting and should be investigated, but it has all the hallmarks of some kind of artifact.

      • Clicking a link in that previous document, we find this bizarre algorithm being used to generate “random” numbers to get the mortality and population estimates by race:

        >”In the concluding slide for Example 1, the bridging process is illustrated. First, the edited codes from the previous slide are collapsed into the four race groups specified in the 1977 OMB Standards for Race and Ethnicity; in this case we have only White and Asian or Pacific Islander (API)—a two-race combination. Using the bridging algorithm model for API/White, along with data on county of residence and other covariates for this case, we find that the probability is 60% that White would be the main race and 40% that API would be selected as main race. Next, these percentages are converted to adjacent intervals in the range from 0 to 100. The next step is to select a random number between 0 and 100. To do this, the program takes the last three digits of the certificate number and inverts them into a number with one decimal place. In this particular example, the last three digits are 433, which inverted yield 33.4. Since 33.4 lies between 0 and 60, White is chosen as the main race for this case.”

        http://www.cdc.gov/nchs/data/dvs/Multiple_race_docu_5-10-04.pdf

        Why don’t they use a real PRNG here? I am getting less and less confident in this race and Hispanic origin information.
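
The quoted procedure is easy to mimic in a few lines. What follows is only a rough sketch of the steps as the passage describes them, not the actual NCHS program: the function names, the certificate number `"000433"`, and the handling of a draw that lands exactly on an interval boundary are all assumptions made for illustration.

```python
def draw_from_certificate(cert_number: str) -> float:
    """Reverse the last three digits and read them with one decimal place.

    Example from the quoted text: last three digits 433 -> 334 -> 33.4.
    """
    last_three = cert_number[-3:]
    return int(last_three[::-1]) / 10


def assign_main_race(cert_number: str, probabilities: list[tuple[str, float]]) -> str:
    """Pick the race whose adjacent interval on [0, 100) contains the draw.

    `probabilities` is a list of (race, percent) pairs summing to 100.
    Treating each interval as closed on the left and open on the right
    is an assumption; the quoted text does not say how ties are handled.
    """
    draw = draw_from_certificate(cert_number)
    upper = 0.0
    for race, percent in probabilities:
        upper += percent
        if draw < upper:
            return race
    return probabilities[-1][0]


# Hypothetical certificate number ending in 433, with the 60/40 split from the example.
bridging_probs = [("White", 60.0), ("API", 40.0)]
print(draw_from_certificate("000433"))
print(assign_main_race("000433", bridging_probs))
```

Written this way, the draw is a deterministic function of the certificate number. That makes the assignment reproducible, but it is only as random as the way certificate numbers happen to be issued.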

      • I was trying to find out where these death certificate numbers come from and clicked “dataset documentation” here: http://wonder.cdc.gov/ucd-icd10.html

        “Information included on the death certificate about the race and Hispanic ethnicity of the decedent is reported by the funeral director as provided by an informant, often the surviving next of kin, or, in the absence of an informant, on the basis of observation. Race and ethnicity information from the census is by self-report. To the extent that race and Hispanic origin are inconsistent between these two data sources, death rates will be biased. Studies have shown that persons self-reported as American Indian, Asian, or Hispanic on census and survey records may sometimes be reported as white or non-Hispanic on the death certificate, resulting in an underestimation of deaths and death rates for the American Indian, Asian, and Hispanic groups. Bias also results from undercounts of some population groups in the census, particularly young black males, young white males, and elderly persons, resulting in an overestimation of death rates. In ” Quality of death rates by race and Hispanic origin: A summary of current research, 1999,” the authors estimate that the misclassification and under-coverage result in overstated death rates for the white and black populations (1% and 5%, respectively) and understated death rates for other population groups (American Indians, 21%; Asian or Pacific Islanders, 11%; and Hispanics, 2%). See also The validity of race and Hispanic Origin reporting on death certificates in the United States.”

        http://wonder.cdc.gov/wonder/help/ucd.html#

        tl;dr:
        “Studies have shown that persons self-reported as American Indian, Asian, or Hispanic on census and survey records may sometimes be reported as white or non-Hispanic on the death certificate, resulting in an underestimation of deaths and death rates for the American Indian, Asian, and Hispanic groups. Bias also results from undercounts of some population groups in the census, particularly young black males, young white males, and elderly persons, resulting in an overestimation of death rates.”

        I don’t know how much this type of issue will affect the final analysis, but it all needs to be taken into account…

    • I wonder if they ever presented this at a workshop. In economics, especially at a big-name place like Princeton, presenting your paper is an interactive process in which the audience makes comments along the way and tries to find all the holes in the paper. This age adjustment is exactly the kind of thing that’s apt to come up— something important, and not that profound, but which the authors overlooked.

  2. Or could it simply be that the data was easier to obtain using binned ranges? When I accessed the CDC Wonder site, the detailed cause of death data along with demographic and geographic breakdowns was much easier to get than if I listed single ages. The single age files were often larger than CDC Wonder would allow you to display and download (limited to 75,000 rows of data) and the Leading Causes of Death data is only available for binned age ranges (detailed causes of death are available for single ages).

    • Dale:

      Yes, I suspect that data availability was a big part of it. But once you realize that age aggregation can be causing a big bias, you have to correct for it one way or another, either by getting data per single year of age or by doing some sort of model-based bias correction.

      So I’m guessing that Case and Deaton simply didn’t think of this particular bias at all, or, if they did think of the bias, that they assumed it would be small.

      Yet another factor, which has not come into this discussion yet, is that in social science and particularly in economics, data display and data summary are considered to be less serious than regression modeling. So it’s possible that Case and Deaton saw these trend graphs as just exploration, no big deal, hence not requiring careful thought about aggregation bias and measurement error.
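
To make the age-aggregation bias in this thread concrete, here is a minimal sketch using invented numbers (not CDC data). The age-specific death rates are held fixed between the two years and only the age mix inside the 45-54 bin changes. The slope of 40 deaths per 100,000 per year of age is an assumption, chosen to be roughly in line with the figure mentioned in the comments above.

```python
def crude_rate(rates_by_age: dict[int, float], pop_by_age: dict[int, float]) -> float:
    """Population-weighted (crude) death rate for the whole age bin."""
    deaths = sum(rates_by_age[age] * pop_by_age[age] for age in rates_by_age)
    population = sum(pop_by_age[age] for age in rates_by_age)
    return deaths / population


ages = range(45, 55)

# Assumed death rates per 100,000 by single year of age, identical in both years.
rates = {age: 300 + 40 * (age - 45) for age in ages}

# Assumed populations: evenly spread in year 1, tilted toward older ages in year 2.
pop_year1 = {age: 1000 for age in ages}
pop_year2 = {age: 900 + 20 * (age - 45) for age in ages}

crude1 = crude_rate(rates, pop_year1)
crude2 = crude_rate(rates, pop_year2)

print(f"crude rate, year 1: {crude1:.2f}")
print(f"crude rate, year 2: {crude2:.2f}")
print(f"apparent increase:  {crude2 - crude1:.2f}")
```

With these made-up inputs the binned rate rises by about 6.7 per 100,000, even though no single-year rate changed at all. That apparent increase is the kind of artifact that age adjustment is meant to remove.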

  3. I’d be tempted to give them a break. Deaton just got the Nobel, so it’s likely they’re getting more press attention than they’ve ever seen, and doing their best to deal with it. All this while getting ready to go to Stockholm — and write a paper for the occasion.

    • Dave:

      Fair enough. They’re busy enough without having to write a long letter acknowledging the contributions of various post-publication reviewers. But then why can’t they simply make neutral remarks? What good is served by potshots such as “then to see people blogging about it in real time — that’s not the way science really gets done”? That’s the gatekeeper mentality at its worst, and it’s entirely unnecessary.

  4. Part of it, I guess, is exactly the speed of the news cycle. Suppose, Case and Deaton do not trust that Prof. Gelman and others did a reasonable and accurate job for correcting/disaggregating the data. They simply don’t know either way what they think about the subject. Maybe it’s important, maybe not, or important for what exactly and, first of all, whether others did it right. I wouldn’t be too comfortable to require people to figure it out in a week. Having said that, others cannot just wait until Case and Deaton will be ready.

    • D.O.,

      No need for them to trust my blogging. But, then again, no need for me to trust something that appeared in PNAS. I have complete sympathy with Case and Deaton if they are tired of this topic and want to work on something else. It’s not their duty to check that my analyses are correct.

      On the other hand, if they really spent a year on this project and made such basic mistakes in analysis and communication, I think it would be appropriate for them to thank those of us who noticed and corrected their errors. They could do this even while reserving judgment on the correctness of our analyses, and they could certainly avoid potshots such as “then to see people blogging about it in real time — that’s not the way science really gets done.” They have every right to move on to new research projects. But as far as I’m concerned they have no mandate to disparage the research of those of us who are looking at these data.

      And, one more thing, this is a point I’ve made before: I don’t hold it against Case and Deaton that they worked hard on this project for a year without detecting fundamental errors in analysis. This has happened to me too. And when it happened, I thanked the people who pointed out my errors to me. Not a grudging “we’re not complaining.” A simple thanks.

      • I generally agree with you. My suggestion was that they are just overwhelmed with the speed of it.

        By the way, there used to be(?) a mechanism to try and get most of this stuff right by circulating working papers, preprints, pre-publication seminars, conferences etc.

        • D.O.:

          Yes, I agree about the overwhelming speed. I did email links to a couple of these posts to Deaton, but I think he was pointed to them by others in any case.

          • It’s not all that fast. What’s more common in economics is something like this: you’re giving a seminar and twenty minutes into it someone says, “Aren’t 10-year bins too big given that the population is aging?” and you’re expected to reply within thirty seconds, even if only to say, “Well, gee, I didn’t think of that. It might well eliminate all my results. But let’s go on with the next hour of the workshop anyway, cause maybe it will not be a big problem.” It’s wonderful to have a blog post that both raises the problem and shows the limited extent of it, particularly if you’ve already published the paper and can let someone else publish a new paper correcting yours while giving you the honor of coming up with the important part of the idea.

        • Perhaps this is just my lack of English skills, but why exactly do you think that there is anything ‘grudging’ about this quote? As I read it, you cut the full quote as it would represent his line of reasoning: his ‘we are not complaining’ is sincere. He’s just alluding that there is a lot of spurious stuff about the paper out there, but (as he adds in the following sentence) that it is still great to have it that widely read and talked about. Why would you think that the quote is about you or people doing serious discussion of the paper, at all – especially given your exchange with him? Don’t forget that this stuff is a topic among the Ron Unzes of this world, too, and wondering ‘whether it’s being done appropriately’ might be a rather obvious worry that at the same time has nothing whatsoever to do with, e.g., you…

        • Martin:

          Maybe you’re right, it could be a British-English thing. I think that if all I’d seen was the “we are not complaining” line, I wouldn’t have been bothered. It’s when coupled with the Case quote (“that’s not the way science really gets done. . . . And so it’s a little hard for us to respond to all of the blog posts that are coming out. . . . And if this is all people shooting from the hip, I don’t think that’s any way to move science forward, to move the research forward”), that I got really upset. “We are not complaining”: Sure, that alone could just be British understatement.

      • “they could certainly avoid potshots such as “then to see people blogging about it in real time — that’s not the way science really gets done.””

        I don’t see how the comment “then to see people blogging about it in real time — that’s not the way science really gets done” is a potshot — I see it as someone expressing their understandable befuddlement and difficulty adapting quickly to sudden immersion in new ways of doing science that conflict with what they learned and have known as how science was done. Frankly, to me, calling this “the gatekeeper mentality” does sound like a potshot.

        “… as far as I’m concerned they have no mandate to disparage the research of those of us who are looking at these data.”

        I don’t see how what they have said or done is disparaging the research of others who are looking at these data.

        “… And when it happened, I thanked the people who pointed out my errors to me. Not a grudging “we’re not complaining.” A simple thanks.”

        I don’t see the comment “we’re not complaining” as grudging — just an understandable response to what was quite a deluge — an unexpected deluge. As D.O. says and you agreed, they were undoubtedly overwhelmed. You can’t expect perfection when people are overwhelmed.

        • Case did not say “that’s not the way we are used to science being done / that’s not the way science used to be done when we were starting out!” she said “that’s not the way science really gets done” which implies that the commenters were not doing science and are unscientific, and that she and Deaton were doing science / being scientific in comparison.

        • Rachael,
          I would say that “implies” is too strong here: You have made an inference from Case’s statement (which you have a right to do), but that does not mean that Case’s statement necessarily implies what you infer.

  5. I do think the incredible speed is a factor: it’s only been up as an ‘in press’ manuscript for a week or two! I imagine the constant ‘ping’ of incoming emails and pings to be simply overwhelming.

    All that said, if you are putting something out there that is so surprising and unexpected that has been ‘hiding’ in plain sight, you should stress-test the hell out of analysis, stratify by everything possible, and try really hard to break it.

    Finally, the incredibly byzantine ‘bridging’ scheme linked above would give me nightmares if I needed to ensure robust and reproducible analyses!

  6. “‘…that’s not the way science really gets done.'”

    Contra Case’s remarks: I believe leading Economics journals, tabloids and foundations should offer bounties for the sort of work Gelman and others have done on this paper. (I was not paid for this suggestion.) Award on a sliding scale; the firstest and bestest gets the mostest. It would do wonders for humanity, and maybe for economics.

    As one who thinks the way “science really gets done” involves reproducible controlled experiments that test theoretical predictions, I don’t see science has that much to do with this. The question is how arithmetic gets done.

  7. The original article is written up by The Guardian here
    http://www.theguardian.com/science/2015/nov/02/death-rate-middle-aged-white-americans-aids?CMP=Share_iOSApp_Other
    with interviews with Case and Deaton about how important and dramatic it is…

    and then *that* article gets cited when The Guardian runs an article on poverty and drug addiction in Appalachia, which is said to contribute to the death rate nationally. And hey, maybe the toll in the region is big enough (the article mentions a lost generation, and 56% of all accidental deaths in Kentucky as being due to overdoses) to have affected the national statistics.
    http://www.theguardian.com/us-news/2015/nov/12/beattyville-kentucky-and-americas-poorest-towns

  8. A few points.

    1) Complaining is OK. It is very different from denouncing criticism!

    I am listing below what seem like acceptable points for Case and Deaton.

    A. The speed and the unforgiving pressure to react in real time.
    Deaton / Case went through what a rape suspect goes through. Please supply your version of events *now*, if not, you will be guilty by default.

    For an academic that worked very hard and meticulously for a year, this is – on a personal level – unfair.

    Is speed of light blogosphere useful? Definitely
    Is it fair with the original authors? Sometimes very unfair.

    B. The media / Twitter misrepresentation
    Many quoted the comments as debunking / refuting the original work.
    This borders on the libelous.

    C. The quality issues. Blogs inherently don’t have the heavy checking etc. of reviewed journals. This also makes the discussion less useful in one respect, and more burdensome for the original authors as well.

    D. The pros and cons of social-media-fueled scientific discourse are in themselves a complex question. Assuming that it is only positive seems very naive to me. I’ll not dig here. Just remember that there must be various tradeoffs etc. Unless one is a kid, where everything has no tradeoff.

    • Jazi:

      I never asked Case or Deaton to supply their version of events, now or otherwise.

      But, in any case, I think that when Case says that blogging in real time is “not the way science really gets done,” she is misinformed and rude. My graphs are as real as her graphs, and publication in PNAS (the journal that published the himmicanes and hurricanes study, etc) doesn’t make her graphs or her analyses any more real than mine.

      Beyond all this, I did point out an error in her paper. I also emphasized over and over again that this was in no way a debunking of their work. Indeed I directly contacted a journalist at one point and asked not to be labeled as a “critic” of that study.

      When people point out errors in my work, I thank them. I don’t go around saying that this is “not the way science really gets done.”

      I agree with you that blogs are not an unalloyed good. There are tradeoffs. But I disagree with the statement that blogging is not the way science really gets done.

  9. Is there an age-adjusted version of their Figure 1? — ie, the correction to white NH American death rates appears more meaningful if you also are adjusting the comparison groups.

    • Doing publication quality age adjustment requires some care. Andrew did a back of envelope calculation of the crude death rate and then took the mean of the CDR across single years. But that’s not the way demographers would calculate an adjusted death rate, if only because there is not the expectation that the population would be evenly distributed or that the death rates for older and younger would be equal. Now, he did kind of use a standardized population that is uniformly distributed over age, so that is good, but it’s not a standardized population that anyone really would think is appropriate (especially if you are arguing that the change of the median age by 6 months is going to have large effects). From my understanding you would also have to consider how best to calculate confidence intervals for the adjusted estimates. It would just be a lot slower. That doesn’t mean it’s not great for people to do back of the envelope calculations; that is great, but it’s not the same thing as writing a paper and really thinking about it.

      • Elian, I agree, this is not a simple task. My impression with scientific blogs is that they make a valuable contribution, but often blog analyses are less careful than rigorous research published in sound academic journals (which can take years of work and scrutiny), and this is why blogs are not the standard for scientific communication. Sure, they are a useful exchange of tentative ideas, and they help to spur creativity or help readers to find some new useful reference, but on many occasions the analysis is superficial (an exception is the case where the post is discussing a finished paper by its author). So I understand that the original authors do not bother too much with this type of feedback, at least not until the criticism of their original paper is formally published.
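
For readers who want to see what the choice of standard population does, here is a minimal sketch of direct age standardization. All the numbers are invented for illustration, and the two sets of weights are hypothetical; neither is an official standard population.

```python
def age_standardized_rate(rates_by_age: dict[int, float],
                          standard_weights: dict[int, float]) -> float:
    """Directly standardized rate: age-specific rates averaged with fixed weights."""
    total_weight = sum(standard_weights.values())
    weighted = sum(rates_by_age[age] * w for age, w in standard_weights.items())
    return weighted / total_weight


ages = list(range(45, 55))

# Assumed death rates per 100,000 by single year of age (made up).
rates = {age: 300 + 40 * (age - 45) for age in ages}

# Two hypothetical standard populations over the same ages.
uniform_weights = {age: 1 for age in ages}               # equal weight at every age
younger_heavy_weights = {age: 55 - age for age in ages}  # more weight on younger ages

rate_uniform = age_standardized_rate(rates, uniform_weights)
rate_younger = age_standardized_rate(rates, younger_heavy_weights)

print(f"uniform standard:       {rate_uniform:.1f}")
print(f"younger-heavy standard: {rate_younger:.1f}")
```

The two standards give different levels (480 versus 420 per 100,000 with these inputs), which is the sense in which different adjustment methods give different answers. Whichever standard is chosen, applying the same weights to every calendar year is what stops shifts in the age mix from showing up as a trend. A publication-quality analysis would also need a defensible standard population and uncertainty intervals, which this sketch leaves out.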

  10. Deaton and Case are economists. The paper is sociology, or public health, or demography. It’s not economics. The critics are sociologists, demographers, statisticians, public health researchers. Has any economist ever taken criticism from a non-economist social scientist with good grace? Or do all economists believe that all other social sciences are worthless and that the only people who know anything are economists?
