Software to sow doubts as you meta-analyze

This is Jessica. Alex Kale, Sarah Lee, TJ Goan, Beth Tipton, and I write,

Scientists often use meta-analysis to characterize the impact of an intervention on some outcome of interest across a body of literature. However, threats to the utility and validity of meta-analytic estimates arise when scientists average over potentially important variations in context like different research designs. Uncertainty about quality and commensurability of evidence casts doubt on results from meta-analysis, yet existing software tools for meta-analysis do not necessarily emphasize addressing these concerns in their workflows. We present MetaExplorer, a prototype system for meta-analysis that we developed using iterative design with meta-analysis experts to provide a guided process for eliciting assessments of uncertainty and reasoning about how to incorporate them during statistical inference. Our qualitative evaluation of MetaExplorer with experienced meta-analysts shows that imposing a structured workflow both elevates the perceived importance of epistemic concerns and presents opportunities for tools to engage users in dialogue around goals and standards for evidence aggregation.

One way to think about good interface design is that we want to reduce sources of “friction,” like the cognitive effort users have to exert when they go to do some task; in other words, to minimize the so-called gulf of execution. But then there are tasks like meta-analysis where being on auto-pilot can produce misleading results. We don’t necessarily want to create tools that encourage certain mindsets, like users getting overzealous about suppressing sources of heterogeneity across studies in order to get some average that they can interpret as the ‘true’ fixed effect. So what do you do instead? One option is to create a tool that undermines the analyst’s attempts to combine disparate sources of evidence every chance it gets.

This is essentially the philosophy behind MetaExplorer. This project started when I was approached by an AI firm pursuing a contract with the Navy, where systematic review and meta-analysis are used to make recommendations to higher-ups about training protocols or other interventions that could be adopted. Five years later, a project that I had naively figured would take a year (this was my first time collaborating with a government agency) culminated in a tool that differs from other software out there primarily in its heavy emphasis on sources of heterogeneity and uncertainty. It guides the user through making their goals explicit, like the target context they care about; extracting effect estimates and supporting information from a set of studies; identifying characteristics of the studied populations and analysis approaches; and noting concerns about asymmetries, flaws in analysis, or mismatch between the studied and target context. These sources of epistemic uncertainty get propagated to a forest plot view where the analyst can see how an estimate varies as studies are regrouped or omitted. It’s limited to small meta-analyses of controlled experiments, and we have various ideas based on our interviews with meta-analysts that could improve its value for training and collaboration. But maybe some of the ideas will be useful either to those doing meta-analysis or to those building software. Codebase is here.
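
To make the sensitivity idea concrete, here is a minimal sketch of the kind of recomputation that forest plot view supports. This is not MetaExplorer’s code; the study names and numbers are invented, and it uses simple fixed-effect inverse-variance pooling (rather than whatever model a given analysis would actually call for) just to show how a pooled estimate shifts as flagged studies are omitted.

```python
# Hypothetical sketch: pooled effect under leave-one-out omission.
# Study names and (effect, standard error) values are made up for illustration.
import numpy as np

studies = {
    "Study A": (0.42, 0.15),
    "Study B": (0.10, 0.20),
    "Study C": (0.55, 0.18),
    "Study D": (0.05, 0.12),
}

def pooled(effects, ses):
    """Fixed-effect inverse-variance pooled estimate and its standard error."""
    w = 1.0 / np.asarray(ses) ** 2
    est = np.sum(w * np.asarray(effects)) / np.sum(w)
    return est, np.sqrt(1.0 / np.sum(w))

names = list(studies)
effects = [studies[n][0] for n in names]
ses = [studies[n][1] for n in names]

est, se = pooled(effects, ses)
print(f"All studies: {est:.2f} (SE {se:.2f})")

# Recompute with each study omitted, e.g., one the analyst flagged as a poor
# match for the target context.
for i, name in enumerate(names):
    est_i, se_i = pooled(effects[:i] + effects[i + 1:], ses[:i] + ses[i + 1:])
    print(f"Omitting {name}: {est_i:.2f} (SE {se_i:.2f})")
```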

8 thoughts on “Software to sow doubts as you meta-analyze”

  1. Interesting. This reminds me of the idea of increasing transparency through a multiverse analysis and also of the problems that Uri Simonsohn and others have pointed out regarding robustness checks.

    It seems that a big issue here is the purpose of the analysis. If a “robustness check” is done with the purpose of quieting reviewers and responding to objections to a claim, then often it seems that the robustness check does its job all too well, reifying errors and leading researchers to overconfidence in claims that are sometimes implausible as well as unsupported by the data (see for example the paper discussed here). In contrast, if the goal is to explore uncertainty, or to see how things went wrong in an analysis, these sorts of explorations can be really helpful.

    • Yes, the idea is to stress sensitivity analysis in meta-analysis: whenever you reach a decision about study inclusion that’s not obvious, you multiplex over the alternatives to get a sense of the implications. I guess if you are hell-bent on showing that a sensitivity analysis doesn’t threaten your claims, this tool might help you explore more possible combinations of groupings that you can post hoc rationalize (a toy sketch of that kind of enumeration is at the end of this reply). Though the hope is that everyone is at some point at least a little susceptible to the doubts of a “grouchy reviewer,” which is how one of the meta-analysts we interviewed described the tool.

      I remember the asynchronous collaboration paper. Lots of tools could be made for that purpose rather than it being an afterthought (like causal quartets, which I think are great for criticism).

      This is the first of two papers we just published related to multiverse analysis; I’ll post something on the other soon.
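
      To make the “combinations of groupings” point concrete, here is a toy sketch (hypothetical data, not the tool itself, and again plain fixed-effect pooling): branch the meta-analysis at every contested inclusion decision and look at the spread of pooled estimates across branches.

      ```python
      # Hypothetical sketch: enumerate every combination of contested inclusion
      # decisions and report the resulting spread of pooled estimates.
      from itertools import product
      import numpy as np

      core = {"Study A": (0.42, 0.15), "Study B": (0.10, 0.20)}        # clearly in scope
      contested = {"Study C": (0.55, 0.18), "Study D": (0.05, 0.12)}   # inclusion debatable

      def pooled(data):
          """Fixed-effect inverse-variance pooled estimate for a dict of (effect, SE)."""
          effects = np.array([e for e, _ in data.values()])
          weights = 1.0 / np.array([se for _, se in data.values()]) ** 2
          return float(np.sum(weights * effects) / np.sum(weights))

      results = {}
      for keep in product([False, True], repeat=len(contested)):
          branch = dict(core)
          branch.update({name: stats for (name, stats), k in zip(contested.items(), keep) if k})
          label = " + ".join(name for name, k in zip(contested, keep) if k) or "core studies only"
          results[label] = pooled(branch)

      for label, est in results.items():
          print(f"{label}: {est:.2f}")
      print(f"Spread across branches: {min(results.values()):.2f} to {max(results.values()):.2f}")
      ```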

  2. This seems like such a smart way to enhance and improve meta-analyses.

    I couldn’t quite understand Andrew’s comment and your response, so maybe you’ve already addressed this… but is there a lack-of-transparency issue if this approach is a bit of a black box, where discrete elements go in and an amalgamated result comes out without the breakdown of the sensitivity analysis being apparent? Does that question even make sense?

    • Yes, it makes sense as a critique of meta-analysis, where there might be a lot of uncertainty suppression or reduction on the analyst’s part that the consumer of the final results never sees. In the spirit of Andrew’s comment about how it often depends on the researchers’ goals whether the sensitivity analysis is actually helpful at surfacing uncertainty, there’s also a risk that some researcher explores a bunch of different potential meta-analytic estimates but only reports one that aligns with their preferred interpretation of the evidence on that topic. Though I don’t think this particular tool is really making that easier, since the main design pattern is not letting the analyst forget about concerns they flagged earlier in the review process (all of that uncertainty gets propagated so that they have to grapple with it again alongside the final estimates).

      • “…there’s also a risk that some researcher explores a bunch of different potential meta-analytic estimates but only reports one that aligns with their preferred interpretation of the evidence on that topic.”

        Exploratory meta-analysis. When I put it that way it doesn’t sound so bad, amirite?

        I thought of this potential for misuse as well, but then I started flipping it around and thinking that a tool like this could help a scrupulous researcher find patterns in the data that would be otherwise hard to spot.

        • Right, rather than a “risk” it is a key part of science that should be celebrated and rewarded.

          The error is treating post-dictions the same as pre-dictions, not in guessing explanations for what we observe.
