Chris Guure writes:

I am trying to construct an informative prior by synthesizing information from the literature (a meta-analysis) and then to apply that prior to a real data set (longitudinal data with over 20 years of follow-up).

In constructing the prior using the meta-analysis data, the issue of publication bias came up. I have tried looking to see if there is any literature on this but it seems almost all the articles on Bayesian meta-analysis do not actually account for this issue apart from one (Givens, Smith and Tweedie 1997).

My thinking was that I could use a data augmentation approach, fitting a joint model under the assumption that the observed data are normally distributed and that unobserved studies probably exist but are not included among my studies. These can be treated as missing data (missing not at random, or non-ignorable missingness). A Bernoulli distribution could then be used to model the missingness.

But according to Lesaffre and Lawson (2012, p. 196), in hierarchical models the data augmentation approach enters in a quite natural way via the latent (unobserved) random effects. To me this statement implies that my earlier idea may not be necessary and may even bias the posterior estimates.

My reply: You could certainly do this, build a model in which there are a bunch of latent unreported studies and then go from there. I don’t know how well this would work, though, for two reasons:

1. Estimating what’s missing based on the shape of the distribution: that’s tough. Inferences will be so sensitive to all sorts of measurement and selection issues, and I’d be skeptical of whatever comes out.

2. You’re trying to adjust for unreported studies in a meta-analysis. But I’d be much more worried about choices in data processing and analysis in each of the studies you have. As I’ve written many times, I think the file-drawer problem is overrated and it’s nothing compared to the garden of forking paths.
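To make the setup concrete, here is a minimal simulation sketch of the kind of selection mechanism under discussion: study estimates are normal, and a Bernoulli filter decides which ones get reported. Everything numeric here (the true effect, the standard error, the 20% publication rate for non-significant results) is a made-up assumption for illustration, not something estimated from any literature.

```python
import numpy as np

def simulate_published(true_effect=0.1, n_studies=2000, se=0.1,
                       p_publish_nonsig=0.2, seed=0):
    """Simulate study estimates and a Bernoulli publication filter.

    All default values are hypothetical, chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    # Each study reports a normally distributed estimate of the true effect.
    estimates = rng.normal(true_effect, se, size=n_studies)
    # "Significant" here means a positive estimate with z > 1.96.
    significant = estimates / se > 1.96
    # Significant studies are always published; the rest with probability p.
    p_publish = np.where(significant, 1.0, p_publish_nonsig)
    published = rng.random(n_studies) < p_publish
    return estimates, published

estimates, published = simulate_published()
mean_all = estimates.mean()
mean_published = estimates[published].mean()
print(f"mean of all studies:       {mean_all:.3f}")
print(f"mean of published studies: {mean_published:.3f}")
```

The published mean comes out above the mean of all studies, which is the bias a latent-study model would try to undo. The simulation knows the selection rule exactly; in a real meta-analysis that rule has to be guessed, which is where the sensitivity in point 1 comes from.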

Chris (and others) might be interested in this paper by Maime Guan and Joachim Vandekerckhove titled, “A Bayesian approach to mitigation of publication bias”. It builds off some ideas of that paper by Givens et al. and tries to mix estimates from various bias models to obtain shrunken effect sizes. Cool stuff!

http://www.cidlab.com/prints/guan2015bayesian.pdf

I would imagine in practice the proportion of non-significant results would be all but impossible to estimate. But if the idea is to use the results of the meta-analysis as a prior, any such approach should result in less certainty and so a vague prior. Wouldn’t that be OK, and maybe even desirable?

Akin to all the choices in the garden of forking paths, isn’t Chris Guure’s exercise of constructing an informative prior itself a plethora of subjective choices?

Why is one approach better than the other?

Rahul:

I think the point is to construct the model as transparently as possible. In the forking-paths papers that I’ve criticized, the trouble is that the selected comparisons are presented as if they’re the only things that could be done, which is at best naive.

Chris (and others),

If you want to build a good meta-analysis model the first thing to do is to look at the quality of the published evidence. There are tools to assess the risk of bias of RCTs and observational studies (e.g. http://handbook.cochrane.org/front_page.htm).

Then you should use your Bayesian imagination to build a suitable model for the evidence at hand.

In recent years clinical trials have been registered (e.g. http://www.isrctn.com/), so publication bias should be less relevant than it was 20 years ago. If a study was not published, you can contact the authors to ask about the results.

Hope it helps!

I don’t expect to see one that will be credible/convincing – there are just too many challenges given the way studies are conducted, results are reported (written up), and papers are selectively published.

I was first convinced of this by Iyengar, S. & Greenhouse, J. B. (1988). Selection models and the file drawer problem. Statistical Science, 109–117 and comments.

For instance, the publication selection models used in http://www.cidlab.com/prints/guan2015bayesian.pdf are just not realistic – real journal decisions are a mix of reported results and politics (e.g. some groups can always get their papers published).

But publication selection is just one of many conflated problems. Some studies are much better conducted and reported than others, and that has to be dealt with somehow (Andrew’s garden of forking paths is _one_ of those quality dimensions).

Now I actually got into meta-analysis as a means to construct informative priors, and the general idea back then (1980s) was to flatten the posterior (or the likelihood, given no prior) to get a conservative prior. These days we would realize it also needs to be shifted, likely downwards. But by exactly how much?
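As a rough sketch of what "flatten and shift" could look like when the meta-analysis posterior is summarized as a normal distribution: raising a N(m, s^2) density to a power a in (0, 1] gives N(m, s^2 / a), which widens it, and the mean can be multiplied by a factor to pull it toward zero. The function name and the `flatten` and `shrink` arguments are hypothetical, the shift assumes a positive effect being pulled toward zero, and nothing here answers the "how much" question.

```python
import math

def conservative_prior(post_mean, post_sd, flatten=0.5, shrink=1.0):
    """Turn a normal meta-analysis posterior into a more conservative prior.

    flatten: power in (0, 1] applied to the normal density; N(m, s^2)
             raised to this power is proportional to N(m, s^2 / flatten).
    shrink:  factor in [0, 1] multiplying the mean, pulling it toward zero.
    Both are hypothetical tuning choices, not quantities estimated from data.
    """
    if not 0 < flatten <= 1:
        raise ValueError("flatten must be in (0, 1]")
    if not 0 <= shrink <= 1:
        raise ValueError("shrink must be in [0, 1]")
    prior_mean = shrink * post_mean
    prior_sd = post_sd / math.sqrt(flatten)
    return prior_mean, prior_sd

# Example: halve the mean and double the standard deviation.
prior_mean, prior_sd = conservative_prior(0.4, 0.1, flatten=0.25, shrink=0.5)
print(prior_mean, prior_sd)
```

Running the later analysis over a grid of `flatten` and `shrink` values would show how much the conclusions depend on these choices.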

As Stephen Senn used to put it – meta-analysis is like aging: the only alternative is much worse (trying to assume no information?)

So it has to be done, it almost surely will be importantly wrong but hopefully much less wrong than not doing it at all.

(It also provides information on what should be done differently in new studies – see http://statmodeling.stat.columbia.edu/2012/02/12/meta-analysis-game-theory-and-incentives-to-do-replicable-research/ )

How does the garden of forking paths help? It only describes a problem but not the solution strategy, correct?

For new studies you would try to minimize the number/complexity of paths and document those that could not be so avoided.

For existing studies, if some had more paths than others and you could believe the documentation of this [and there were no other major problems, as there almost always are], you could downweight or model to help sort that out – more paths, less weight.

(Here is a technical paper on that type of approach http://biostatistics.oxfordjournals.org/content/2/4/463.full.pdf )
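A bare-bones sketch of the "more paths, less weight" idea, using ordinary inverse-variance pooling with each study's precision multiplied by a quality score. This is a simple illustration only and is not taken from the paper linked above. The `quality` scores are hypothetical inputs that someone would have to assign, and treating them as variance inflation factors is itself an assumption.

```python
import numpy as np

def downweighted_pool(estimates, std_errors, quality):
    """Fixed-effect pooled estimate with precision scaled by a quality score.

    quality[i] in (0, 1]: 1 keeps the study's full inverse-variance weight;
    smaller values (say, for studies with more forking paths) reduce it.
    This is equivalent to inflating study i's variance by 1 / quality[i].
    """
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    quality = np.asarray(quality, dtype=float)
    weights = quality / std_errors**2
    pooled = np.sum(weights * estimates) / np.sum(weights)
    pooled_se = 1.0 / np.sqrt(np.sum(weights))
    return pooled, pooled_se

# Three hypothetical studies; the first reports a much larger effect.
est = [0.5, 0.1, 0.1]
se = [0.1, 0.1, 0.1]
pooled_equal, se_equal = downweighted_pool(est, se, [1.0, 1.0, 1.0])
pooled_down, se_down = downweighted_pool(est, se, [0.2, 1.0, 1.0])
print(pooled_equal, se_equal)
print(pooled_down, se_down)
```

Downweighting the first study pulls the pooled estimate toward the other two and widens the pooled standard error, since less information is being counted.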

If smart people can’t adequately process the studies that were done from what was reported – then nothing has been actually learned from them and that should be the inference along with informing decisions for future studies.

You can’t make chicken salad from chicken shit, but you want to understand when you are just getting that ingredient and how to get something better in the future.

Dear Chris,

some years ago I was thinking about a similar problem while playing around with the Copas/Shi model for publication bias in meta-analysis. I learned that there are a lot of empirical results on publication bias, time delay to publication for “negative” results etc. in clinical research from which it would be probably easy to derive informative priors. The following papers are a good way to start (but I am sure that more evidence came up in more recent years):

http://www.ncbi.nlm.nih.gov/pubmed/9450711

http://www.ncbi.nlm.nih.gov/pubmed/17443632

http://www.ncbi.nlm.nih.gov/pubmed/17443628

Hope that helps,

Oliver

Oliver:

I do like the Copas/Shi work on publication bias and almost suggested it here, but it only gets to the point of being a sensitivity analysis rather than an adjustment, primarily due to lack of confidence in model specifications.

Disclaimer: Copas was originally my external thesis examiner but withdrew after the Viva for undisclosed reasons.