Don’t worry, the post will be coming . . . eventually

Posted on February 24, 2019 7:49 PM by Andrew

Jordan Anaya sends along a link and writes:

Not sure if you’re planning on covering this, but I noticed this today. This could also maybe be another example of the bullshit asymmetry principle since the original paper has an altmetric of 1300 and I’m not sure the rebuttal will get as much attention.

I replied that, yes, I was informed about these papers several months ago, and I wrote a post which is scheduled for 20 Apr.

11 thoughts on “Don’t worry, the post will be coming . . . eventually”

Jordan Anaya on February 24, 2019 at 8:03 pm said:

The link seems to be missing:
https://twitter.com/Chris_Auld/status/1099342790826254336

- Andrew on February 24, 2019 at 8:17 pm said:
  
  Jordan:
  
  I purposely omitted the link because I thought it would be more fun for people to guess! But, whatever.
  
- Keith O'Rourke on February 25, 2019 at 8:41 am said:
  
  Interesting – about 30 years ago I did some preliminary work with Donald A. Redelmeier on the use of cell phones and crashes and he then teamed up with Rob Tibshirani to finish and publish it. I had very carefully reviewed the drafts for Donald and recall them being very sensible.
  
  It also received a lot of press coverage https://www.ncbi.nlm.nih.gov/pubmed/9017937 and I think Rob even got an award from the University of Toronto for that work.
  
  I should re-read that old paper and the new one.
  
  - Anonymous on March 16, 2019 at 12:20 am said:
    
    Glad to hear that. Looking at this write-up https://www.nytimes.com/2010/08/31/science/31profile.html it seemed fairly likely that he built a career on cherry-picking noise.
    
    His April-20 paper’s analysis is totally inexcusable and I hope it prompts a reexamination of his other notable discoveries (“Win an Academy Award and you’re likely to live longer than had you been a runner-up. Interview for medical school on a rainy day, and your chances of being selected could fall.”)
    
Anonymous on February 25, 2019 at 3:35 am said:

From the paper: “The 25-year study interval identified 1.3 million drivers involved in 882 483 crashes causing 978 328 fatalities. In total, 1369 drivers were involved in fatal crashes after 4:20 PM on April 20 whereas 2453 drivers were in fatal crashes on control days during the same time intervals (corresponding to 7.1 and 6.4 drivers in fatal crashes per hour, respectively). The risk of a fatal crash was significantly higher on April 20 (relative risk, 1.12; 95% CI, 1.05-1.19; P = .001)”
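For readers who want to check the arithmetic in that quote, here is a rough sketch in Python. The counts, the 25-year span, and the 4:20 PM start come from the quoted passage; the midnight end of the window and the two control days per April 20 are assumptions made here to get the hours of exposure, and the Wald interval on the log scale is a textbook approximation, not necessarily the method the paper used.

```python
import math

# Counts taken from the quoted passage.
drivers_april20 = 1369
drivers_control = 2453

# Exposure. The 25 years and the 4:20 PM start are in the quote;
# the midnight cutoff and two control days per year are ASSUMPTIONS.
years = 25
hours_per_day = 7 + 40 / 60          # 4:20 PM to midnight (assumed)
control_days_per_year = 2            # assumed

hours_april20 = years * hours_per_day
hours_control = years * control_days_per_year * hours_per_day

# Drivers in fatal crashes per hour, and their ratio.
rate_april20 = drivers_april20 / hours_april20
rate_control = drivers_control / hours_control
relative_risk = rate_april20 / rate_control

# Approximate 95% interval: Wald interval on log(RR), treating the two
# counts as independent Poisson variables (a simplifying assumption).
se_log_rr = math.sqrt(1 / drivers_april20 + 1 / drivers_control)
ci_low = math.exp(math.log(relative_risk) - 1.96 * se_log_rr)
ci_high = math.exp(math.log(relative_risk) + 1.96 * se_log_rr)

print(f"rates per hour: {rate_april20:.1f} vs {rate_control:.1f}")
print(f"relative risk: {relative_risk:.2f} ({ci_low:.2f} to {ci_high:.2f})")
```

Under these assumptions the hourly rates and the relative risk round to the figures in the quote, and the approximate interval lands close to the reported one. The point is only that the headline number follows from two counts and an exposure window, so everything rests on how the window and the control days were chosen.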

This is the kind of thing I am increasingly worried about with large data sets, especially when existing ones are reanalyzed.

I have brought this issue up before on this blog, but I am still wondering whether analyzing existing (large) data sets leads to a whole new form of “p-hacking”, “HARK-ing”, and “selective reporting”.

I reason that the papers (and analyses) coming from these large pre-existing data sets will:

1) almost certainly not control for multiple testing/analyses (e.g. because how could I know who else is analyzing, or has analyzed, the data sets, in order to correct for multiple testing/analyses),

2) probably (consciously or unconsciously) encourage HARK-ing (because I have probably read the first paper using the large data set, and/or taken a peek at the data, so I will probably be influenced by what I read), and

3) invite selective reporting: I can totally see how people will look (or have looked and since forgotten that they did) in these large data sets for finding X, and only write a paper about finding X when they find what they want to find and/or want to publish.

I wonder if my reasoning is correct…
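Point 1 can be made concrete within the classical framework this comment has in mind. The sketch below is illustrative only: it assumes independent tests of true null hypotheses, the helper names are made up for this example, and the p-values are hypothetical, not taken from any paper. It shows how quickly the chance of at least one “significant” result grows with the number of uncoordinated analyses, and what a Holm adjustment would do to a small set of p-values.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive among independent null tests."""
    return 1 - (1 - alpha) ** num_tests


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, ...
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


for k in (1, 5, 20, 100):
    print(k, "analyses ->", round(familywise_error_rate(k), 3))

# HYPOTHETICAL p-values from five separate analyses of one data set.
hypothetical_p = [0.001, 0.012, 0.03, 0.04, 0.2]
print([round(p, 3) for p in holm_adjust(hypothetical_p)])
```

The catch raised in point 1 is visible in the code: `holm_adjust` needs the full list of p-values, which nobody has when the analyses are spread across unrelated research groups and unpublished attempts.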

- Anonymous on February 25, 2019 at 9:05 am said:
  
  If any of this reasoning makes sense (I am not well versed in statistics), perhaps that could make for an interesting, and possibly useful, paper.
  
  You could perhaps even try to find a set of papers that used a certain large pre-existing data set, look at all the findings that have been published over the years concerning that specific data set, and then correct them for multiple testing (if that has not been done in all the separate papers) to see how many are still “significant”.
  
  Does that make any sense?
  
- Andrew on February 25, 2019 at 9:37 am said:
  
  Anon:
  
  I don’t think that Harking (“hypothesizing after results are known”) is a bad thing, nor do I think that it is best practice to “control for multiple testing/analysis.” Instead, I think it is better to study all comparisons of interest and analyze them using a multilevel model, as discussed here.
  
  - Anonymous on February 25, 2019 at 10:44 am said:
    
    Thank you for the link to the paper.
    
    I am wondering whether you think HARK-ing is not a bad thing, and controlling for multiple testing/analyses is not necessary, because you are proposing an alternative perspective: a “Bayesian multilevel perspective” instead of a “classical type 1 error” perspective.
    
    If HARK-ing and not controlling for multiple testing/analyses are still problematic in the classical type 1 error statistical framework (or whatever the appropriate term is here), I wonder if the points I am trying to raise may still hold for analyses done from the “classical type 1 error” perspective.
    
Anonymous on February 25, 2019 at 8:56 am said:

“…and I wrote a post which is scheduled for 20 Apr.”

Just noticed: + 1 for scheduling the post on that specific date given the topic :)

- Daniel Lakeland on February 25, 2019 at 4:30 pm said:
  
  I was thinking that was a little…. far out… far out man.
  
- Nahim on February 26, 2019 at 2:15 pm said:
  
  Oh that’s absolutely brilliant.
  

Statistical Modeling, Causal Inference, and Social Science
