Multiple Comparisons in Linear Regression

Hamdan Yousuf writes:

I was reading your Kanazawa letter to the editor and I was interested in your discussion of multiple comparisons. This might be an elementary issue, but I don’t quite understand when the issue of multiple comparisons arises, in general. To give an example from research I have been involved in, assume I am trying to fit a linear regression on a response variable (PR: placebo responsivity score, continuous, experimentally measured) and am assessing 100-200 potential predictors (mostly scores on psychological scales). The predictors are highly multicollinear, such that it is difficult to build a “model” using more than one of them, and the matter simplifies to picking the single predictor that optimally explains variance in my response variable. Note that my number of observations (subjects) is small, about 40.

Is this considered a situation with multiple comparisons? That is, I am simultaneously looking at p-values for the correlation between my response and each potential predictor. In practice, a handful of the variables yield very good p-values (.001-.005), and these variables make sense scientifically. But should I be using a correction for multiple comparisons, say Bonferroni, with p = .05/200 = .00025, in which case nothing is significant? Or am I misinterpreting the idea of multiple comparisons to begin with?

My reply: No, I don’t think you should be using classical multiple comparisons methods in your problem. See here and here for further discussion. For your example, maybe it would make sense to combine a bunch of your potential predictors into a single combined scale. I’m guessing that your real question is not, “Are any of these 200 potential predictors correlated with the outcome in the population,” but rather “How good are these predictors?” I think you’d be better off with a multilevel model in which you handle the uncertainty using partial pooling.
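To make the partial-pooling suggestion concrete, here is a minimal sketch of one simple version of that idea: empirical-Bayes shrinkage of the 200 per-predictor correlations, treating them as draws from a common population. This is only an illustration of the principle, not the specific multilevel model the reply has in mind; the data are simulated and all variable names (`n_subjects`, `n_predictors`, etc.) are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_predictors = 40, 200  # matches the setting in the letter

# Simulated predictors and response (no true effects in this sketch).
X = rng.standard_normal((n_subjects, n_predictors))
y = rng.standard_normal(n_subjects)

# Per-predictor correlations, moved to Fisher's z scale, where the
# sampling variance is approximately 1 / (n - 3) for any true correlation.
r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_predictors)])
z = np.arctanh(r)
se2 = 1.0 / (n_subjects - 3)

# Method-of-moments estimate of the between-predictor variance tau^2:
# observed spread in z minus what sampling noise alone would produce.
tau2 = max(z.var() - se2, 0.0)

# Partial pooling: each estimate is shrunk toward the grand mean,
# more aggressively when tau^2 is small relative to the noise variance.
shrinkage = tau2 / (tau2 + se2)
z_pooled = z.mean() + shrinkage * (z - z.mean())
r_pooled = np.tanh(z_pooled)
```

With 40 subjects and 200 noisy predictors, the estimated `tau2` is small, so the pooled correlations are pulled strongly toward their common mean; the apparent “winners” among the raw correlations shrink the most, which is exactly the behavior that replaces a hard significance cutoff.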

1 thought on “Multiple Comparisons in Linear Regression”

  1. Multicollinearity and multiple testing are a common problem in neuroimaging analysis: there are many correlated voxels, and we test each voxel 'independently'. Bonferroni corrections are rarely used in this setting. One way multiple-comparisons correction can be done, both in the case of Yousuf's data (well, probably) and in neuroimaging, is with max-statistic permutation testing, using the randomization distribution of the maximal statistic. I became aware of this method through Andrew P. Holmes' work; see, e.g., "Nonparametric permutation tests for functional neuroimaging: A primer with examples" (2002), or chapter 6 of his PhD thesis.
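The max-statistic procedure described in the comment can be sketched in a few lines: permute the response, recompute all correlations, and keep only the largest absolute statistic from each permutation; comparing the observed correlations against this null distribution controls the familywise error rate while automatically accounting for the correlation among predictors. This is a hedged illustration on simulated data, not Holmes' original implementation; the sizes and names are assumptions matching the letter's setting.

```python
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_predictors, n_perm = 40, 200, 1000

X = rng.standard_normal((n_subjects, n_predictors))
y = rng.standard_normal(n_subjects)

def abs_correlations(X, y):
    """Absolute correlation of y with each column of X."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = Xc.T @ yc
    den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(num / den)

observed = abs_correlations(X, y)

# Null distribution of the *maximum* statistic under permuted responses.
max_null = np.array([
    abs_correlations(X, rng.permutation(y)).max() for _ in range(n_perm)
])

# Familywise-error-adjusted p-value for each predictor: the fraction of
# permutations whose maximum statistic meets or exceeds the observed one.
p_adjusted = (1 + (max_null[None, :] >= observed[:, None]).sum(axis=1)) / (1 + n_perm)
```

Because the same permuted maxima are used for every predictor, the correction adapts to how correlated the predictors are: with 200 highly collinear scales, the effective number of independent tests is far below 200, so this is typically much less conservative than Bonferroni.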
