Correcting for multiple comparisons in a Bayesian regression model

Joe Northrup writes:

I have a question about correcting for multiple comparisons in a Bayesian regression model. I believe I understand the argument in your 2012 paper in the Journal of Research on Educational Effectiveness that when you have a hierarchical model there is shrinkage of estimates toward the group-level mean, and thus there is no need to add any additional penalty to correct for multiple comparisons. In my case I do not have hierarchically structured data; that is, I have only one observation per group, but I have a categorical variable with a large number of categories. Thus I am fitting a simple multiple regression in a Bayesian framework. Would putting a strong, mean-zero, multivariate normal prior on the betas in this model accomplish the same sort of shrinkage (it seems to me that it would), and do you believe this is a valid way to address criticism of multiple comparisons in this setting?

My reply: Yes, I think this makes sense. One way to address concerns about multiple comparisons is to do a simulation study, conditional on some reasonable assumptions about the true effects, and see how often you end up making statistically significant but wrong claims. Or, if you want to put in more effort, you could do several simulation studies, demonstrating that if the true effects are concentrated near zero but you assume a weak prior, then the multiple comparisons problem does arise. That would be an interesting research direction, actually: studying how multiple comparisons problems gradually arise as the prior becomes weaker and weaker. Such an analysis could aid our understanding by bridging between the classical and fully informative Bayesian perspectives on multiple comparisons.
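Here is a minimal sketch of that kind of simulation in Python (not from the original post; all settings are illustrative). It assumes the questioner's setting of one observation per category, so the regression reduces to y_j = beta_j + noise with known noise sd, and the posterior for each beta_j under a normal(0, prior_sd^2) prior is available in closed form. Sweeping the prior sd from strong to essentially flat shows the "significant but wrong" claims emerging as the prior weakens:

```python
import numpy as np

rng = np.random.default_rng(0)

J = 500        # number of categories (one observation each)
sigma = 1.0    # known residual sd
true_sd = 0.5  # true effects concentrated near zero
n_sims = 200   # simulated datasets per prior setting

def significant_claims(prior_sd):
    """Return (share of 95% posterior intervals excluding zero,
    share of those claims with the wrong sign, i.e. type S errors)."""
    claims = wrong = 0
    for _ in range(n_sims):
        beta = rng.normal(0.0, true_sd, J)    # true effects near zero
        y = beta + rng.normal(0.0, sigma, J)  # one noisy obs per category
        # Conjugate posterior for each beta_j under a normal(0, prior_sd^2) prior:
        post_prec = 1.0 / sigma**2 + 1.0 / prior_sd**2
        post_mean = (y / sigma**2) / post_prec
        post_sd = np.sqrt(1.0 / post_prec)
        lo = post_mean - 1.96 * post_sd
        hi = post_mean + 1.96 * post_sd
        signif = (lo > 0) | (hi < 0)
        claims += signif.sum()
        wrong += (signif & (np.sign(post_mean) != np.sign(beta))).sum()
    return claims / (n_sims * J), wrong / max(claims, 1)

# Sweep from a strong prior to an essentially flat one.
for prior_sd in [0.1, 0.5, 1.0, 2.0, 10.0, 100.0]:
    rate, type_s = significant_claims(prior_sd)
    print(f"prior sd {prior_sd:6.1f}: {rate:5.3f} of intervals exclude zero, "
          f"type S rate among them {type_s:5.3f}")
```

With a strong prior the intervals rarely exclude zero, and when they do they are rarely of the wrong sign; as the prior sd grows, the analysis approaches the classical unpenalized one and the rate of confident wrong-sign claims rises.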

5 thoughts on “Correcting for multiple comparisons in a Bayesian regression model”

  1. I’m not sure why the fact that the data aren’t hierarchically structured prevents shrinking the betas towards their common mean. It seems to me that the mean of the prior can be given a hyperprior instead of being fixed to zero. (Naturally this only makes sense if the betas are expected to have similar magnitudes.) Am I missing something?

    • I think the author basically figured this out but wanted some kind of confirmation. It does make sense to me. I've even worked recently on a biological model where we expected big effects for a few genes and no effect for the others. We wound up using a t or Cauchy distribution for those priors, thereby either shrinking them toward zero or allowing them to be outliers. It worked nicely.

      • One of the keys here was that we *didn’t* use “uninformative” priors. We placed a scale on the Cauchy/t distributions that was itself exponentially distributed with mean 0.1 or something like that. We knew from biological principles that some things just weren’t going to be affected by the treatment, and we weren’t ashamed to put that knowledge in the prior. It’s a lot easier when you have a physical/causal theory you can incorporate into your priors.
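For concreteness, here is a minimal sketch of that kind of prior written with PyMC (a stand-in choice; the commenter doesn't name their software, the data below are fake, and nu=3 is an arbitrary tail-heaviness setting). The scale of the Student-t prior on the effects is itself exponentially distributed with mean 0.1, so most betas are shrunk hard toward zero while genuinely large effects can escape the shrinkage:

```python
import numpy as np
import pymc as pm

# Fake data standing in for the gene setting described above:
# y[j] is a single noisy measurement of effect beta[j] for gene j,
# where ~10% of genes have a real effect and the rest have none.
rng = np.random.default_rng(1)
J = 50
true_beta = np.where(rng.random(J) < 0.1, rng.normal(0.0, 2.0, J), 0.0)
y = true_beta + rng.normal(0.0, 0.5, J)

with pm.Model():
    # Scale of the effect distribution: exponential with mean 0.1
    # (lam is the rate parameter, so mean = 1/lam).
    scale = pm.Exponential("scale", lam=10.0)
    # Heavy-tailed t prior: shrinks most betas toward zero while
    # letting a few large effects stand out as outliers.
    # Per the first comment above, the common mean could instead be
    # given its own hyperprior, e.g. mu = pm.Normal("mu", 0, 1),
    # rather than being fixed at zero.
    beta = pm.StudentT("beta", nu=3, mu=0.0, sigma=scale, shape=J)
    sigma_y = pm.HalfNormal("sigma_y", sigma=1.0)
    pm.Normal("y_obs", mu=beta, sigma=sigma_y, observed=y)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=1)
```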
