Christian Bartels sent along this paper, which he described as an attempt to use informative priors for frequentist test statistics.

I replied:

I’ve not tried to follow the details but this reminds me of our paper on posterior predictive checks. People think of this as very Bayesian but my original idea when doing this research was to construct frequentist tests using Bayesian averaging in order to get p-values. This was motivated by a degrees-of-freedom-correction problem where the model had nonlinear constraints and so one could not simply do a classical correction based on counting parameters.
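The posterior predictive check idea can be sketched in code. This is only an illustration, not the construction from either paper: the toy normal model, conjugate prior, and parameter-dependent discrepancy below are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (hypothetical): y_i ~ Normal(theta, 1), prior theta ~ Normal(0, 10^2)
n = 20
y = rng.normal(1.0, 1.0, size=n)

# Conjugate normal-normal posterior for theta (known unit variance)
prior_var = 100.0
post_var = 1.0 / (1.0 / prior_var + n)
post_mean = post_var * y.sum()

def discrepancy(data, theta):
    # A deviance D(y, theta) that depends on the parameter:
    # the largest absolute deviation of the data from theta.
    return np.max(np.abs(data - theta))

# Posterior predictive p-value: average, over posterior draws of theta
# and replicated data y_rep | theta, the event that the replicated
# discrepancy is at least as extreme as the observed one.
n_sims = 4000
theta_draws = rng.normal(post_mean, np.sqrt(post_var), size=n_sims)
exceed = 0
for theta in theta_draws:
    y_rep = rng.normal(theta, 1.0, size=n)
    exceed += discrepancy(y_rep, theta) >= discrepancy(y, theta)
p_value = exceed / n_sims
print(p_value)
```

Because the data are simulated from the assumed model, the p-value here should come out moderate; under model misspecification it would drift toward 0 or 1.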

To which Bartels wrote:

Your work starts from the same point as mine: existing frequentist tests may be inadequate for the problem of interest. Your work also ends where I would like to end, performing tests via integration over (i.e., sampling of) parameters and future observations using the likelihood and prior.

In addition, I try to anchor the approach in decision theory (as referenced in my write-up). Perhaps this is too ambitious; we’ll see.

Results so far, using the language of your publication:

– The posterior distribution p(theta|y) is a good choice for the deviance D(y,theta). It gives optimal confidence intervals/sets in the sense proposed by Schafer, C.M. and Stark, P.B., 2009. Constructing confidence regions of optimal expected size. Journal of the American Statistical Association, 104(487), pp.1080-1089.

– Using informative priors for the deviance D(y,theta)=p(theta|y) may improve the quality of decisions, e.g., may improve the power of tests.

– For the marginalization, I find it difficult to strike the balance between proposing something that can be argued/shown to give optimal tests, and something that can be calculated with available computational resources. I hope to end up with something like one of the variants shown in your Figure 1.

I noted that you distinguish test statistics, which do not depend on the parameters, from deviances, which do. I’m not aware of anything that prevents you from using deviances that depend on parameters for frequentist tests – it is just inconvenient if you are after generic, closed-form solutions for tests. I did not make this distinction, and refer to tests regardless of whether they depend on the parameters or not.

I don’t really have anything more to say here, as I have not thought about these issues for a while. But I thought Bartels’s paper and this discussion might interest some of you.

There’s also this paper: https://arxiv.org/abs/1504.02935

Optimal Multiple Testing Under a Gaussian Prior on the Effect Sizes

Edgar Dobriban, Kristen Fortney, Stuart K. Kim, Art B. Owen

We develop a new method for frequentist multiple testing with Bayesian prior information. Our procedure finds a new set of optimal p-value weights called the Bayes weights. Prior information is relevant to many multiple testing problems. Existing methods assume fixed, known effect sizes available from previous studies. However, the case of uncertain information is usually the norm. For a Gaussian prior on effect sizes, we show that finding the optimal weights is a non-convex problem. Despite the non-convexity, we give an efficient algorithm that solves this problem nearly exactly. We show that our method can discover new loci in genome-wide association studies. On several data sets it compares favorably to other methods. Open source code is available.
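The weighting idea in the abstract can be illustrated with a generic weighted Bonferroni rule. This is not the Bayes-weights procedure of Dobriban et al. (which solves a non-convex optimization for the weights); it only shows the mechanism by which prior information enters as p-value weights. The p-values and weights below are made up for the example.

```python
import numpy as np

def weighted_bonferroni(p_values, weights, alpha=0.05):
    """Reject H_i when p_i <= w_i * alpha / m.

    Nonnegative weights averaging to 1 preserve the family-wise
    error guarantee of plain Bonferroni, while letting prior
    information shift power toward up-weighted hypotheses.
    """
    p = np.asarray(p_values, dtype=float)
    w = np.asarray(weights, dtype=float)
    m = len(p)
    if not np.isclose(w.mean(), 1.0):
        raise ValueError("weights must average to 1")
    return p <= w * alpha / m

# Hypothetical example: prior information up-weights the first two tests.
p_vals = [0.004, 0.020, 0.030, 0.600]
weights = [2.0, 1.5, 0.4, 0.1]
print(weighted_bonferroni(p_vals, weights).tolist())  # → [True, False, False, False]
```

With uniform weights the rule reduces to ordinary Bonferroni; here only the first test clears its raised threshold of 0.025.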

Thanks for pointing out this work. It seems that we now have two fundamentally different approaches to using a prior on parameters to increase the power of frequentist tests. Dobriban et al. (2015) use the prior to define an expected power, which is then optimized by adjusting weights for multiple testing. The above proposal argues that the Bayesian posterior is an optimal test statistic, and that using informative priors can be an advantage in that they increase the power of the resulting frequentist test.

I like Box’s advice, cited by Dobriban, to be Bayesian when predicting but frequentist when testing. The proposal above goes one step further and suggests being Bayesian when simulating and testing but frequentist when controlling the errors of your tests.

Looking forward to seeing how this evolves!

There are all kinds of prior information, much of which doesn’t come as a prior distribution and isn’t easily translated into one. Often, though not always, frequentist testers can make use of all or some such information just by deciding intelligently what exact inference problem to solve, without using priors (although I’m happy to acknowledge that Bayesian priors sometimes help).

When I started to read the paper, I was waiting for a practical example to see whether I could argue in some way that the relevant information can actually be incorporated in other ways, but no such practical example is there. Pity! In any case, “incorporating prior information” cannot be identified with “using a Bayesian prior”.

Thanks for providing more context. As to “no such practical example”: there remains more to do. The write-up presents an interesting new idea and shows that it could have some advantages. More practical examples remain to be done. Perhaps this could be an opportunity for someone in California to start a career; see also Bartels (1990), Biological Mass Spectrometry 19(6), 363-368, and work that builds on it. :)