
An R package for multiverse analysis and counting researcher degrees of freedom

Joachim Gassen writes:

In a recent blog post I introduce an in-development R package that helps researchers to identify, document and exhaust inherent research design choices in work based on observational data.

As the analysis that I propose is similar in notion to a multiverse analysis that you suggested, I thought that maybe the package and the blog article might be of interest to you.

I haven’t had a chance to look at all of this but I’m posting it here as it might be of interest to you.

We’ve had lots of discussion of problems with statistical methods, so it’s good to see people developing methods to address some of these concerns.


  1. Thanks! Author here. The package is in development. Code is on GitHub. Feel free to reach out for feedback. Joachim

    • Adam Schwartz says:

      This is really great! I’d been thinking about the same thing for a while, and love that someone implemented it. Have you thought about how coding choices impact the choice of technique used and how you might combine that information? For example, rather than use a continuous variable, maybe I’d convert the independent and dependent variables into factors and switch from Pearson correlation to a chi-squared test. I’m inferring (perhaps improperly) from your post that the technique used is fixed, while how variables get encoded is what you specify to create the analysis of researcher degrees of freedom. Although I guess you could argue that all tests are really linear models of some form, and so perhaps a fixed model is fine.

      • Thank you! I tried to design the package to be as agnostic as possible about underlying techniques. The structure that it imposes is that each research design has to be implemented by a functional chain with each functional step taking the output of the prior step as input, applying a list of continuous and/or discrete choices and generating output that serves as input to the next step. The final step produces output that is reported as the “result”.

        So, you are free to use linear or categorical models or whatever you are interested in. From the standpoint of the code, this is just another choice.

        Does that make sense?
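To make the chain-of-steps idea above concrete, here is a minimal sketch. It is written in Python rather than R and is not the package's actual API: the step functions, the `DESIGN` structure, and the helpers `run_chain` and `exhaust` are all hypothetical names made up for illustration. It also assumes that a continuous choice (here `cutoff`) is represented by a small grid of candidate values so that the choices can be enumerated.

```python
from itertools import product
from statistics import mean, median
import math

# Hypothetical steps: each takes the previous step's output plus its own choices.
def drop_outliers(data, cutoff):
    """Keep observations at or below the cutoff (a continuous choice, gridded)."""
    return [x for x in data if x <= cutoff]

def transform(data, scale):
    """Optionally log-transform the data (a discrete choice)."""
    return [math.log(x) for x in data] if scale == "log" else list(data)

def estimate(data, estimator):
    """Final step: its output is what gets reported as the result.
    The technique itself (mean vs. median) is just another choice."""
    return mean(data) if estimator == "mean" else median(data)

# The research design: an ordered chain of (step, {choice name: candidate values}).
DESIGN = [
    (drop_outliers, {"cutoff": [10.0, 100.0]}),
    (transform, {"scale": ["level", "log"]}),
    (estimate, {"estimator": ["mean", "median"]}),
]

def run_chain(data, design, choices):
    """Run one version of the design: feed each step's output into the next."""
    out = data
    for step, options in design:
        kwargs = {name: choices[name] for name in options}
        out = step(out, **kwargs)
    return out

def exhaust(data, design):
    """Run the chain once for every combination of choices."""
    names = [name for _, options in design for name in options]
    grids = [options[name] for _, options in design for name in options]
    results = []
    for combo in product(*grids):
        choices = dict(zip(names, combo))
        results.append((choices, run_chain(data, design, choices)))
    return results

data = [1.0, 2.0, 4.0, 8.0, 50.0]  # toy data, made up for the example
results = exhaust(data, DESIGN)
for choices, result in results:
    print(choices, round(result, 3))
```

With two candidate values for each of three choices, the sketch runs the chain 2 × 2 × 2 = 8 times, and the spread of the eight results shows how much the reported number depends on the design choices. Swapping in a different technique only means adding another option to the last step.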
