Do we still recommend average predictive comparisons? Click here to find the surprising answer!

Posted on December 29, 2019 5:59 PM by Andrew

Usually these posts are on 6-month delay but this one’s so quick I thought I’d just post it now . . .

Daniel Habermann writes:

Do you still like/recommend average predictive comparisons as described in your paper with Iain Pardoe?

I [Habermann] find them particularly useful for summarizing logistic regression models.

My reply: Yes, I do still recommend this!

9 thoughts on “Do we still recommend average predictive comparisons? Click here to find the surprising answer!”

BozoBrain on December 31, 2019 at 9:43 pm said:

Are these essentially the Average Marginal Effects which are obtained from the “margins” command in Stata (and the newish margins command in R)? Note that these quantities can be obtained by post-processing MCMC (e.g., Stan) output.

- Andrew on December 31, 2019 at 11:40 pm said:
  
  Bozo:
  
  I don’t know what Stata etc. do, but the average predictive comparisons in our paper do not in general correspond to effects. They can be causal effects under certain assumptions, but “comparisons” is the more general term.
  
  - BozoBrain on January 1, 2020 at 12:03 am said:
    
    Andrew,
    
    Thank you. I didn’t mean to wade into the muddy waters of causal inference. However, the approach of averaging over empirical distributions of predicted values is really catching on in the Stata (and now to some extent R) world. For a Stan example, see the first answer to such a question here:
    
    https://stackoverflow.com/questions/45037485/calculating-marginal-effects-in-binomial-logit-using-rstanarm
    
    Frankly, Stan and other MCMC software seem to be an ideal platform for computing these quantities without resorting to normal asymptotic theory, which can lead to biased inference.
    
    - Daniel Lakeland on January 1, 2020 at 2:00 am said:
      
      > averaging over empirical distributions of predicted values
      
      I think this means averaging over *posterior* distributions of predicted values. Empirical distributions are distributions of observed quantities, but predictions are not observed, except in the trivial sense that when you ask a computer to calculate something you can then observe what it calculated.
      
      The ultimate point of that bit of pedanticness is that in order to have a distribution to average over, you must have a posterior distribution, in other words, you must be doing Bayesian analysis.
    - Carlos Ungil on January 1, 2020 at 6:22 am said:
      
      > In order to have a distribution to average over, you must have a posterior distribution, in other words, you must be doing Bayesian analysis.
      
      If you have a model y=f(x1, x2, x3, …) that gets you predicted values for a given set of covariates, to get average predicted values you just need a distribution of covariates. It doesn’t have to be « Bayesian », it could come straight from the census.
    - Daniel Lakeland on January 1, 2020 at 10:33 am said:
      
      Sure, yes, you can always average across a population too. It wasn’t clear to me that this was what was being suggested. I guess there are two modes of analysis here. One: you can average a predicted change across the population. You can do this regardless of your modeling method, and it gives you some kind of population information, a single number. Two: you can average a predicted change across a posterior and get a predicted change for each unit, which gives you a population of expected changes.
    - Andrew on January 1, 2020 at 10:35 am said:
      
      Yes, in our paper we average over the posterior and also we average over the data, which represents an average over a hypothetical superpopulation.
    - BozoBrain on January 1, 2020 at 4:03 pm said:
      
      One useful application is with simple Gaussian regression with interaction terms:
      y = beta_0 + beta_1*x_1 + beta_2*x_2 + beta_3*x_1:x_2
      
      Here one might be interested in the marginal “effect” (not necessarily causal) of x_1. But what about the interaction with x_2? One could plot various dose-response type curves of x_1 versus y for various values of x_2. But the margins approach allows for a single x_1 versus y dose-response graph.
      
      The trick is this. Suppose there are individuals i=1,…,N. Then for a particular value of x_1, predict y for x_2_{i=1}, x_2_{i=2}, etc. Then average the predicted values with error. This allows for examination of the effect of x_1 on y, averaging over the empirical values of x_2 in the data. This can be done in a Bayesian or frequentist framework.
      
      There are many variations of this, and it is quite useful for non-linear (e.g. logit) models. It appears to have become very very popular, and many of us would be giddy if rstanarm or brms would support a full-featured “margins”-like command. Many researchers use Stata largely because of this feature.
    - Andrew on January 1, 2020 at 4:36 pm said:
      
      Bozobrain:
      
      OK, I put in a request: https://discourse.mc-stan.org/t/request-for-average-predictive-comparisons-in-rstanarm-and-brms/12477
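
For readers who want to see the averaging discussed in the thread spelled out, here is a minimal sketch in Python. It is an illustration only: it is not the estimator from the paper with Iain Pardoe, and it is not a reimplementation of Stata’s margins. The function names (`predict`, `average_prediction`, `average_comparison`), the coefficient draws, and the x_2 values are all hypothetical and made up for this example. For each draw of the coefficients (posterior draws in a Bayesian fit, or a single point estimate otherwise), it fixes x_1, predicts y at every observed x_2, and averages.

```python
from statistics import mean


def predict(beta, x1, x2):
    """Expected y under y = b0 + b1*x1 + b2*x2 + b3*x1*x2 (interaction model)."""
    b0, b1, b2, b3 = beta
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def average_prediction(draws, x1, x2_observed):
    """For each coefficient draw, average the prediction at a fixed x1
    over the observed x2 values. Returns one averaged prediction per draw."""
    return [mean(predict(beta, x1, x2) for x2 in x2_observed) for beta in draws]


def average_comparison(draws, x1_low, x1_high, x2_observed):
    """Per-draw difference in averaged predictions between two x1 values."""
    low = average_prediction(draws, x1_low, x2_observed)
    high = average_prediction(draws, x1_high, x2_observed)
    return [h - l for h, l in zip(high, low)]


# Hypothetical inputs, invented purely for illustration.
x2_observed = [0.0, 1.0, 2.0, 5.0]   # observed x_2 values for N = 4 individuals
draws = [                            # three pretend draws of (b0, b1, b2, b3)
    (1.0, 2.0, 0.5, 0.25),
    (1.2, 1.8, 0.4, 0.30),
    (0.9, 2.1, 0.6, 0.20),
]

# A single x_1-versus-y curve: averaged over x_2 and then over draws.
curve = {x1: mean(average_prediction(draws, x1, x2_observed))
         for x1 in (0.0, 1.0, 2.0)}

# One comparison (x_1 from 0 to 1) per draw, so uncertainty is retained.
diffs = average_comparison(draws, 0.0, 1.0, x2_observed)
```

In this linear case the average over x_2 is the same as plugging in the mean of x_2, so the exercise is mostly bookkeeping. With a logit model one would replace `predict` with the inverse logit of the linear predictor, and then averaging the predictions is no longer the same as predicting at the average, which is where this kind of summary earns its keep.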

Statistical Modeling, Causal Inference, and Social Science
