Harking, Sharking, Tharking

Bert Gunter writes:

You may already have seen this [“Harking, Sharking, and Tharking: Making the Case for Post Hoc Analysis of Scientific Data,” John Hollenbeck, Patrick Wright]. It discusses many of the same themes that you and others have highlighted in the special American Statistician issue and elsewhere, but does so from a slightly different perspective, which I thought you might find interesting. I believe it provides some nice examples of what Chris Tong called “enlightened description” in his American Statistician piece.

I replied that Hollenbeck and Wright’s claims seem noncontroversial. I’ve tharked in every research project I’ve ever done.

I also clicked through and read the Tong paper, “Statistical Inference Enables Bad Science; Statistical Thinking Enables Good Science.” The article is excellent—starting with its title—and it brings up many thoughts. I’ll devote an entire post to it.

Also I was amused by this, the final sentence of Tong’s article:

More generally, if we had to recommend just three articles that capture the spirit of the overall approach outlined here, they would be (in chronological order) Freedman (1991), Gelman and Loken (2014), and Mogil and Macleod (2017).

If Freedman were to see this sentence, he’d spin in his grave. He absolutely despised me, and he put in quite a bit of effort to convince himself and others that my work had no value.

Tomorrow’s post: “Statistical Inference Enables Bad Science; Statistical Thinking Enables Good Science”

9 thoughts on “Harking, Sharking, Tharking”

  1. Andrew,

    Why did Freedman despise you? Was it because you were a Bayesian?

    I’ll admit, I find the Freedman worship difficult to understand; he often seemed better at tearing down ideas than building up better ones.

  2. Some:

    I think it was some combination of Bayesian, applied, and social science that bothered him. Any two of the three might well have been OK with him. Also, he was immersed in an anti-Bayesian tradition; see here, here, and here for some perspective.

    Adede:

    It wasn’t a vendetta. He just didn’t respect what I did: he spent approximately zero time understanding my work but a lot of time trying to come up with reasons to attack it. I think he just didn’t understand what I was doing and was intellectually committed to thinking it was wrong, and he was willing to tell untruths in order to push his views. For obvious reasons, this pissed me off, but I don’t think it was personal for him. I’m guessing he was just sure he was correct on the merits and that, from his perspective, any tactics were valid for enforcing this view.

    • Andrew,
      Your Wikipedia page is horrible. I guess you’re not supposed to edit it yourself, but maybe some readers of this blog can add some material: some of your better articles, etc. I went there hoping to find the year you got the American Statistical Association award for “Best Statistician under the age of 40” or whatever it was, but the award isn’t even mentioned. Fortunately, since Freedman’s page includes his death year — jeez, yours doesn’t even tell us _that_ about you! — it is easy to confirm that yeah, he knew you had won it. I know you had too much class to do a victory lap through the Berkeley stats department when you won, but I think it’s a pity that you didn’t. In general I feel like yeah, if there’s a classy way to do something then do it that way, but there are exceptions.

      Freedman is entitled to his belief that what you did (or at least what you did prior to 1995 or whenever it was) had no value — it’s ridiculous but I suppose he’s allowed to believe it — but he was not entitled to falsely claim in an important setting that you had only worked on linear models when in fact you had done (for example) groundbreaking work on incorporating prior information into pharmacokinetic models with Frederic “Freddy Forest” Bois, as Freedman knew perfectly well.

      I know I wrote this as if it’s a comment to Andrew, but of course it’s really meant for the other readers of this blog.

  3. I miss Michigan State. Norb Kerr ran his argument about HARKing past us in the “cognitive brownbag” lecture series when I was in graduate school. At the time I think there were at least three editors in the audience, among the faculty.

    I hope that Norb knows that the discussion hasn’t ended.

  4. Study #2 reads like a fairy story.
    Let me summarise. As a neo-Fisherian.
    Initially they find a correlation (I’m assuming Pearson’s(?)) of 0.1 from a sample size of 100. This corresponds to a P-value of 0.3 and a likelihood ratio (likelihood of effect size>0 / likelihood of effect size<0) of 1.2.
    Interpretation: this result is very close to equivocal but since we don't have the minimum clinically significant effect size it's not possible to know if the sample size is adequate.
    Recommended course of action: stop if sample size is adequate, collect more data if it's not.
    What they did: share anecdotes and then decide without any good reason to do a sub-group analysis.
    For the sub-group of females they find correlation = 0.2 and (assuming n=50) the P-value is 0.16 and LR is 1.9.
    Interpretation: very weak evidence of a positive effect; could be considered hypothesis-generating.
    Correct course of action: Perform a new trial looking at this subgroup alone. Make it a good trial.
    What they did: Made up some rubbish about an interaction with oestrogen and then did a sub-sub-group analysis.
    For the sub-sub-group they get a "significant" result.
    Interpretation: How lucky is this? Is there a fairy god-person involved somewhere?
    Correct course of action: Do a trial of the sub-sub-group if you must. Make it a really good trial.
    What they did: Spruik! Publicise! Tour the world! (This was the course of action taken by the Early Goal Directed Therapy (EGDT) of sepsis group of Emanuel Rivers et al.)
    Happy-ever-after ending: The results replicated, lives were saved. Yay.
    Reality-sucks ending: The result doesn't replicate, and a vast amount of resources was wasted. (This happened with EGDT)

    It's not just tharking, it's wishful-tharking.
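
The P-values quoted in the comment above can be roughly checked with a few lines of Python. This is only a sketch: the commenter does not say how the numbers were computed, so the two-sided test and the Fisher z approximation used here are assumptions, as is the helper name `fisher_z_p_value`. The likelihood ratios are not reproduced, because the comment does not state how they were obtained.

```python
import math


def fisher_z_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for H0: true correlation = 0.

    Uses the Fisher z transform: atanh(r) * sqrt(n - 3) is treated as
    approximately standard normal under H0. This is an assumed method,
    not necessarily the one used in the comment.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail area of the standard normal: erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))


# Full sample (r = 0.1, n = 100) and the female subgroup (r = 0.2, assumed n = 50)
for r, n in [(0.1, 100), (0.2, 50)]:
    print(f"r = {r}, n = {n}: p = {fisher_z_p_value(r, n):.2f}")
```

Under these assumptions the two results come out near 0.32 and 0.16, in line with the rounded figures of 0.3 and 0.16 given above. An exact t-test on the correlation would give very similar values at these sample sizes.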

  5. Also this from Christopher Tong:
    “Moreover, we contend that the well-established distinction between exploratory and confirmatory objectives provides a framework for understanding the proper roles of flexible versus prespecified statistical analyses. Unfortunately, we think that in much of the current use of inferential methods in science, except in specialized fields such as human clinical trials, this distinction is absent.”
    This is relevant to the not infrequent criticisms of medical science made on this blog. (Not to mention the recent spate of doctor-bashing).
