The causal inference competition you’ve all been waiting for!

Posted on April 21, 2016 7:51 PM by Andrew

Jennifer Hill announces “the first-ever ACIC causal inference data analysis competition”:

Is your SATT where it’s at?

Participate by submitting treatment effect estimates across a range of datasets OR by submitting a function (in any of a variety of programming languages) that will take input (covariate, treatment assignment, and response) and generate a treatment effect estimate and confidence interval. Time is short so look now and enter soon!!! Your pride in your favorite method is on the line. Winners will be announced at the conference but you do not need to register to participate.
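
For readers weighing the function-submission route, here is a minimal Python sketch of a function with that shape. It is not the competition's required interface: the name `estimate_satt`, its signature, and the normal-approximation interval are all assumptions made for illustration. It uses a plain difference in means, which is only a reasonable SATT estimate when treatment is close to randomized, and it accepts the covariates without using them.

```python
import math
import statistics
from typing import Sequence, Tuple


def estimate_satt(
    covariates: Sequence[Sequence[float]],  # accepted but unused in this naive sketch
    treatment: Sequence[int],               # assumed binary: 1 = treated, 0 = control
    response: Sequence[float],
    z_crit: float = 1.96,                   # normal critical value for a ~95% interval
) -> Tuple[float, float, float]:
    """Return (estimate, lower, upper) from a difference in group means.

    Hypothetical interface. A serious entry would adjust for the covariates.
    """
    treated = [y for y, t in zip(response, treatment) if t == 1]
    control = [y for y, t in zip(response, treatment) if t == 0]
    if len(treated) < 2 or len(control) < 2:
        raise ValueError("need at least two treated and two control units")

    estimate = statistics.fmean(treated) - statistics.fmean(control)
    # Unpooled (Welch-style) standard error of the difference in means.
    se = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    return estimate, estimate - z_crit * se, estimate + z_crit * se
```

Calling `estimate_satt(x, z, y)` returns a point estimate with lower and upper interval bounds. Ignoring the covariates keeps the sketch short, and it is exactly the kind of domain-blind shortcut the comments below argue about.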

15 thoughts on “The causal inference competition you’ve all been waiting for!”

Aki Vehtari on April 22, 2016 at 3:03 am said:

I thought about participating, but unfortunately my favorite method does not produce confidence intervals.

Daniel Lakeland on April 22, 2016 at 10:06 am said:

Is this seriously proposing a contest for automatic discovery of a *causal* model sans *any* domain specific knowledge?

- Guest on April 22, 2016 at 10:22 am said:
  
  +1
  
- David J. Harris on April 22, 2016 at 10:49 am said:
  
  Not the full causal model, just one parameter (for the SATT part of the contest).
  
  And we do know that there was an intervention whose targets were (more or less) randomized.
  
  - Daniel Lakeland on April 22, 2016 at 11:45 am said:
    
    So, one group of baseball batters is pitched a “standard” ball and one group a ball that has a different core material, and we see a change in distance hit. But we don’t know whether the core material is softer or harder, or has a higher or lower level of energy conservation, or whether there is also a change in the cover material that might alter the friction on the ball, thereby causing different spin characteristics and a different aerodynamic lift effect. And, by the way, we don’t even know that we’re actually talking about batters hitting a baseball.
    
    Pretty much?
    
    - David J. Harris on April 22, 2016 at 12:10 pm said:
      
      Right. Just like, when Fisher compared the effects of different agricultural methods, his ANOVA models didn’t incorporate any information about plant physiology or soil biochemistry, or the fact that the response variable involved crop yields.
    - Keith O'Rourke on April 22, 2016 at 12:40 pm said:
      
      References please?
    - David J. Harris on April 22, 2016 at 1:01 pm said:
      
      Eden and Fisher’s “Studies in crop variation: VI. Experiments on the response of the potato to potash and nitrogen.” PDF available from https://digital.library.adelaide.edu.au/dspace/bitstream/2440/15203/1/78.pdf
      
      Although I’m sure the authors knew quite a bit about potatoes, soil, fertilizers, etc., the statistical models themselves all boil down to “plots with this treatment combination had more potatoes than plots with that treatment combination”.
    - Daniel Lakeland on April 22, 2016 at 1:14 pm said:
      
      Funny how the authors’ note on the front page pretty much says “this analysis is no longer useful”.
      
      Really, the results depend on the “all else equal” fallacy. Results are not going to hold for potatoes grown in different soils, using water with different trace minerals, in a different part of the world, with a significantly different variety, under different climate conditions, with different harvesting techniques, and different growing seasons with prevalence of different pests and different diseases, and different insolation… etc
      
      which is what I mean below by “all the relevant unknowns Z”
      
      So, if someone wants to go back in time to 1929 and grow old varieties of potatoes in the experimental fields in Harpenden, Hertfordshire, England, they’ll be all set.
    - Daniel Lakeland on April 22, 2016 at 12:50 pm said:
      
      I think this gets at a real dichotomy in statistics. We have something like A vs B:
      
      A) “Given a scientific causal model of how things happen, some more or less approximate knowledge of the values of the unknown portions of your model, and some data, discover what you should think about the unknown values”
      
      in which case this kind of contest is sort of meaningless: there are no scientific facts, and because of that there are no unknown parameters to estimate!
      
      B) But, there is a whole different “kind” of statistics in which the goal is really something like “Given some data, find an approximate relationship between x and y that is predictive for y in the case when you have control over x, on average, for cases where all the relevant unknowns Z are not too far from the unknown Z values that happened to exist in the population we studied”
      
      And I think there’s a lot to be said for how doing B when we should really be doing A is basically at fault for most of the concerns about bad science we’ve had on this blog, like cancer research, psychology, and so forth being broken.
      
      I understand that B can be useful, and have good economic benefits, and help people sell soap and make more food, but we really need some kind of distinctive names for these two very different things.
      
      I mean consider this:
      
      http://statmodeling.stat.columbia.edu/2012/01/28/the-last-word-on-the-canadian-lynx-series/
    - Anders Huitfeldt on April 22, 2016 at 1:01 pm said:
      
      I fully endorse Daniel’s comment above. This comment corresponds very closely to my own analysis of what is causing the replicability crisis in non-experimental sciences. Before I noticed this discussion, I made very similar points in an e-mail to the ACIC organizing committee; I consider Daniel’s comment to be a much more eloquent summary of the points I was trying to make.
    - Roy on April 22, 2016 at 3:57 pm said:
      
      This dichotomy has been made before, and if I recall correctly there is a paper out there due to some combination of Friedman, Hastie and Breiman arguing for statistics to have much more of B). The only paper I can find after a short search is http://statweb.stanford.edu/~jhf/ftp/dm-stat.pdf, but I think they also argue this in the intro of their book.
      
      I am much more in the A) camp, and am more interested in learning and understanding why things have happened, but I also recognize that there exists a large and growing body of work that takes the B) approach. People have also recognized that algorithms developed in one approach are essentially the same as those in the other approach (though with different names), which can lead to improvements to both.
    - Dustin on April 22, 2016 at 4:01 pm said:
      
      Hi, as Keith mentioned, it can be useful to have general-purpose methods that nonparametrically fit data and spit out results, without “making heroic assumptions [about the model]”. See Jennifer Hill’s papers on BART; Athey and Imbens on causal trees; Wager and Athey on random forests; etc. I personally don’t know how to incorporate nonparametric methods in my data analyses but I appreciate the effort of reducing the number of forking paths, if you will.
    - Daniel Lakeland on April 22, 2016 at 5:29 pm said:
      
      As I said, “B can be useful”. In fact, although it’s not catchy terminology, I’d say A is the application of statistics to science, and B is the application of statistics to substitute for science in engineering.
      
      Sometimes we don’t have much science, so we can’t do A, but we still have an Engineering problem to solve.
      
      The problem I think comes when people who are supposed to be doing science (that is, discovering why things happen) mistake type B for doing science (e.g., menstrual cycle and voting, beautiful kids and daughters, treatment Q for malady X produces score Z on some pain scale, p < 0.05, etc.). They’re all basically replacing mechanistic but imperfect reasons why things happen with associations between measured variables. They’re all sometimes famously bad at generalizing out of sample.
      
      With a scientific theory, being bad at generalizing out of sample is something you learn from; with an engineering problem of type B, you just take big monetary losses.
      
      Sometimes you never need to generalize out of sample. A few dozen measurements of a fluid’s pressure-volume-density relationship with a regression surface will work perfectly fine for designing a heat pump, provided your pump operates in that regime. And if you really are going to re-plant the same types of potatoes in the same fields next year, it can be useful to know what treatment to apply to approximately maximize yields…
      
      The whole “Big Data” hype is mostly about “doing B can make us a lot of money”, which is fine for a company selling advertising, or electronic gadgets, or even designing car bodies for aerodynamics while keeping wind tunnel modeling costs down, but not so good for actually learning how particular genes are involved in some pathway that regulates a disease process.
- Keith O'Rourke on April 22, 2016 at 12:47 pm said:
  
  Thought that was strange, but then this seemed to explain it: “there is increasing interest in developing methods for causal inference that are highly automated to decrease the burden on applied researchers, and yet produce accurate, precise, and reproducible estimates.”
  
  I do think that is mistaken as an approach to answering causal questions, but it is an approach.
  

Statistical Modeling, Causal Inference, and Social Science
