Don’t worry, the post will be coming . . . eventually

Jordan Anaya sends along a link and writes:

Not sure if you’re planning on covering this, but I noticed this today. This could also maybe be another example of the bullshit asymmetry principle since the original paper has an altmetric of 1300 and I’m not sure the rebuttal will get as much attention.

I replied that, yes, I was informed about these papers several months ago, and I wrote a post which is scheduled for 20 Apr.

11 thoughts on “Don’t worry, the post will be coming . . . eventually”

    • Interesting – about 30 years ago I did some preliminary work with Donald A. Redelmeier on the use of cell phones and crashes, and he then teamed up with Rob Tibshirani to finish and publish it. I had very carefully reviewed the drafts for Donald and recall them being very sensible.

      It also received a lot of press coverage https://www.ncbi.nlm.nih.gov/pubmed/9017937 and I think Rob even got an award from the University of Toronto for that work.

      I should re-read that old paper and the new one.

      • Glad to hear that. Looking at this write-up https://www.nytimes.com/2010/08/31/science/31profile.html, it seemed fairly likely that he built a career on cherry-picking noise.

        His April-20 paper’s analysis is totally inexcusable and I hope it prompts a reexamination of his other notable discoveries (“Win an Academy Award and you’re likely to live longer than had you been a runner-up. Interview for medical school on a rainy day, and your chances of being selected could fall.”)

  1. From the paper: “The 25-year study interval identified 1.3 million drivers involved in 882 483 crashes causing 978 328 fatalities. In total, 1369 drivers were involved in fatal crashes after 4:20 PM on April 20 whereas 2453 drivers were in fatal crashes on control days during the same time intervals (corresponding to 7.1 and 6.4 drivers in fatal crashes per hour, respectively). The risk of a fatal crash was significantly higher on April 20 (relative risk, 1.12; 95% CI, 1.05-1.19; P = .001)”
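
    As a rough sanity check on the arithmetic in that quote, here is a minimal back-of-the-envelope calculation (my own sketch, not the paper’s method; I am assuming each April 20 was compared against two control days, so the control person-time is roughly twice the index person-time):

    ```python
    # Rough check of the relative risk quoted above.
    # Assumption (mine, not stated in the quote): two control days per April 20,
    # so the control person-time is about twice the index person-time.
    import math

    april20_drivers = 1369   # drivers in fatal crashes after 4:20 PM on April 20
    control_drivers = 2453   # drivers in fatal crashes on the control days

    rr = april20_drivers / (control_drivers / 2)

    # Large-sample 95% CI for a ratio of counts, on the log scale
    se_log_rr = math.sqrt(1 / april20_drivers + 1 / control_drivers)
    lo = math.exp(math.log(rr) - 1.96 * se_log_rr)
    hi = math.exp(math.log(rr) + 1.96 * se_log_rr)

    print(f"RR = {rr:.2f}, approx. 95% CI ({lo:.2f}, {hi:.2f})")
    # prints roughly RR = 1.12, CI (1.04, 1.19), in the same ballpark as the
    # reported 1.12 (95% CI, 1.05-1.19)
    ```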

    This is the kind of thing I am increasingly worried about with large data sets, especially analyses of existing ones.

    I have brought this possible problem up before on this blog, but I am still wondering whether analyzing existing (large) data sets leads to a whole new form of “p-hacking”, “HARK-ing”, and “selective reporting”.

    I reason that the papers (and analyses) coming from these large pre-existing data sets will:

    1) almost certainly not control for multiple testing/analyses (because how could I know who else is analyzing, or has analyzed, the data sets, in order to correct for multiple testing/analyses),

    2) probably (consciously or unconsciously) encourage HARK-ing (because I have probably read the first paper using the large data set, and/or taken a peek at the data, so I will probably be influenced by what I read), and

    3) invite selective reporting: I can totally see how people will look (or have looked and since forgotten that they did) in these large data sets for finding X, and only write a paper about finding X when they find what they want to find and/or want to publish.

    I wonder if my reasoning is correct…

    • If any of this reasoning makes sense (I am not well-versed in statistics), perhaps it could make for an interesting, and possibly useful, paper.

      You could perhaps even try to find a set of papers that used a certain large pre-existing data set, look at all the findings concerning that data set that have been published over the years, and then correct them for multiple testing (if that has not been done in the separate papers) to see how many are still “significant” (a rough sketch of such a correction follows this comment).

      Does that make any sense?
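
      If it helps, here is a minimal sketch of what that kind of after-the-fact correction could look like. The p-values and the single-data-set framing are hypothetical, made up for illustration, and the choice of a plain Bonferroni adjustment is mine:

      ```python
      # Hypothetical p-values harvested from separate papers that all analyzed
      # the same pre-existing data set (made-up numbers, for illustration only).
      from statsmodels.stats.multitest import multipletests

      published_pvalues = [0.001, 0.012, 0.030, 0.041, 0.049, 0.020, 0.008, 0.045]

      # Bonferroni correction across the whole family of published tests
      reject, p_adjusted, _, _ = multipletests(published_pvalues, alpha=0.05,
                                               method="bonferroni")

      for p, p_adj, still_sig in zip(published_pvalues, p_adjusted, reject):
          print(f"reported p = {p:.3f} -> adjusted p = {p_adj:.3f}, "
                f"still 'significant': {still_sig}")
      ```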

    • Anon:

      I don’t think that Harking (“hypothesizing after results are known”) is a bad thing, nor do I think that it is best practice to “control for multiple testing/analysis.” Instead, I think it is better to study all comparisons of interest and analyze them using a multilevel model, as discussed here.
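
      For a toy illustration of that alternative (a minimal sketch with simulated data and assumptions of my own, not the analysis from the linked paper): fit one multilevel model to all the comparisons at once, and let partial pooling shrink the noisiest estimates toward the overall mean instead of filtering them by significance.

      ```python
      # Toy partial-pooling sketch: many comparisons, one normal-normal multilevel model.
      # Simulated data; not the analysis from the linked paper.
      import numpy as np

      rng = np.random.default_rng(420)
      n_comparisons = 20
      true_effects = rng.normal(0.0, 0.1, size=n_comparisons)  # mostly small true effects
      se = np.full(n_comparisons, 0.15)                         # std. error of each estimate
      estimates = rng.normal(true_effects, se)                  # noisy observed estimates

      # Empirical-Bayes version of the multilevel model:
      #   theta_j ~ Normal(mu, tau^2),  estimate_j ~ Normal(theta_j, se_j^2)
      mu_hat = estimates.mean()
      tau2_hat = max(estimates.var(ddof=1) - np.mean(se**2), 0.0)  # method-of-moments

      shrinkage = tau2_hat / (tau2_hat + se**2)
      partially_pooled = mu_hat + shrinkage * (estimates - mu_hat)

      print("raw estimates:   ", np.round(estimates, 2))
      print("partially pooled:", np.round(partially_pooled, 2))
      # The most extreme raw estimates are pulled toward the overall mean, which is
      # the multilevel alternative to testing each comparison separately.
      ```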

      • Thank you for the link to the paper.

        I am wondering whether you think HARK-ing is perhaps not a bad thing, and controlling for multiple testing/analyses perhaps not necessary, because you are proposing (and may have been thinking from) an alternative perspective: a “Bayesian multilevel” perspective instead of a “classical type 1 error” perspective.

        If HARK-ing, and not controlling for multiple testing/analyses, are still problematic within the classical type 1 error framework (or whatever the appropriate term is), I wonder whether the points I am trying to raise may still hold for analyses done from that “classical type 1 error” perspective.
