## Priors for counts in binomial and multinomial models

Frank Tuyl writes:

I was hoping to get your opinion on something. I’ve always been a fan of E.T. Jaynes, so really appreciate what Larry Bretthorst did to get his book out there posthumously.

But are you aware of this paper of his? A friend and colleague of mine is getting right into it and was a bit hurt when I said it all looked a bit frequentist to me (e.g., Table 1). I guess I’ve never felt the “need” to put point masses on parameters in the continuous domain, although I could be accused of having done this in a multinomial context.

Anyway, I was just wondering what you think about the Bretthorst paper! (Or even intuitively from a quick glance – I haven’t given it much more than that at this stage…) I always thought that the Behrens-Fisher solution was exactly what we get after transforming to delta = mu_1 – mu_2 and integrating out the sigmas (without worrying about them being equal or not!), but Bretthorst seems to consider it to be a different special case.

I agree that Bretthorst is trying to solve a non-problem here. This sort of thing happens a lot in statistics!

Tuyl followed up:

I just came across an approach to the multinomial involving a Dirichlet prior with small parameters (p.464), as an approximation of zero parameters. I’m somewhat surprised by this, given your observation on p.35 that for the binomial this would imply subtracting one success and one failure from the likelihood – a point I referred to in a 2017 TAS article on priors for the multinomial. At the end of that section you say “A thorough analysis should explore the sensitivity of conclusions to this choice of prior distribution”, and I suggest that such sensitivity would indeed be found for the zero-count categories.

As you may know, Jim Berger and others have recommended parameters equal to 1/c (Perks 1947), the inadequacy of which that 2017 article was meant to address, but later I had to admit to Jim that the uniform prior doesn’t do so well on the example he provided: x1 = n = 100 with c = 1000. I’d like to think I finally cracked it in a follow-up 2019 article.
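Berger’s example is easy to check numerically. The sketch below is my own illustration, not code from either article: under a symmetric Dirichlet(alpha) prior, the posterior mean of each category probability is (x_i + alpha)/(n + c·alpha), and with all 100 observations in one of 1000 categories the answer swings wildly with alpha.

```python
import numpy as np

# Posterior mean of each category probability under a symmetric
# Dirichlet(alpha) prior: (x_i + alpha) / (n + c * alpha).
def posterior_mean(x, alpha):
    x = np.asarray(x, dtype=float)
    return (x + alpha) / (x.sum() + alpha * len(x))

# Berger's example: all n = 100 observations land in the first of
# c = 1000 categories; the rest have zero counts.
c, n = 1000, 100
x = np.zeros(c)
x[0] = n

for alpha, label in [(1.0, "uniform (alpha = 1)"),
                     (1.0 / c, "Perks (alpha = 1/c)"),
                     (1e-4, "near-zero alpha")]:
    m = posterior_mean(x, alpha)
    print(f"{label}: P(category 1) = {m[0]:.4f}, "
          f"P(each empty category) = {m[1]:.2e}")
```

The uniform prior gives the observed category posterior mean 101/1100, about 0.09, even though every observation fell there; the Perks prior gives about 0.99. That spread is the sensitivity for zero-count categories that Tuyl is pointing to.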

I agree these are real problems, and they arise in the binomial model as well. I guess it makes sense that there’s no single prior that will always work, just as there’s no single data model that will always work!
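The binomial version of the same sensitivity can be sketched in a few lines (my own illustration, assuming a symmetric Beta(a, a) prior): the posterior mean is (x + a)/(n + 2a), and with zero observed successes the answer depends heavily on a.

```python
# Posterior mean of the success probability under a Beta(a, a) prior.
def posterior_mean(x, n, a):
    return (x + a) / (n + 2 * a)

# Zero successes in ten trials: the binomial analogue of the
# zero-count-category problem.
x, n = 0, 10
for a, label in [(1.0, "uniform Beta(1, 1)"),
                 (0.5, "Jeffreys Beta(1/2, 1/2)"),
                 (1e-3, "near-Haldane Beta(eps, eps)")]:
    print(f"{label}: posterior mean = {posterior_mean(x, n, a):.4f}")
```

The answers range from about 0.08 (uniform) down to essentially zero (near-Haldane), which is one way to see why no single default prior serves every problem.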

1. Carlos Ungil says:

> I always thought that the Behrens-Fisher solution was exactly what we get after transforming to delta = mu_1 – mu_2 and integrating out the sigmas (without worrying about them being equal or not!), but Bretthorst seems to consider it to be a different special case.

I’m not sure I understand what issue he’s pointing to. If the variances are assumed equal, the problem is simpler and the solution is different from the one obtained without that assumption. With different assumptions we look at different cases.

I wouldn’t say that Table 1 looks frequentist. It sets the notation for the different things we condition on or calculate probabilities for.

> I agree that Bretthorst is trying to solve a non-problem here.

What does that mean?

By the way, the second link has an extra ) at the end.

• Andrew says:

Carlos:

When I say it’s a non-problem I just mean you can solve the problem directly using Bayesian inference. There’s a whole literature on the so-called Behrens-Fisher problem and I think there’s no there there. Also, thanks for pointing out the problem with the link.
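The direct Bayesian route can be sketched by simulation (a minimal sketch of my own, not Bretthorst’s analysis, assuming the standard noninformative prior p(mu, sigma) ∝ 1/sigma independently in each group): integrating out each sigma gives each mu a scaled t posterior, so draws of delta = mu_1 − mu_2 come straight from simulation, with no need to treat unequal variances as a special case.

```python
import numpy as np

rng = np.random.default_rng(0)

def posterior_delta(y1, y2, draws=100_000):
    """Posterior draws of delta = mu_1 - mu_2 under p(mu, sigma) ∝ 1/sigma."""
    def mu_draws(y):
        n = len(y)
        # Marginal posterior: mu | y  ~  ybar + (s / sqrt(n)) * t_{n-1}
        return (y.mean()
                + y.std(ddof=1) / np.sqrt(n) * rng.standard_t(n - 1, size=draws))
    return mu_draws(y1) - mu_draws(y2)

# Simulated data with unequal variances (illustrative values).
y1 = rng.normal(0.0, 1.0, size=20)
y2 = rng.normal(0.5, 3.0, size=12)

delta = posterior_delta(y1, y2)
lo, hi = np.percentile(delta, [2.5, 97.5])
print(f"95% posterior interval for delta: ({lo:.2f}, {hi:.2f})")
```

Nothing in the simulation cares whether the two sigmas happen to be equal; that assumption simply never enters.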

• Carlos Ungil says:

I would say that he solves, using Bayesian inference, a slightly different (set of) problem(s). One of the pieces is the so-called Behrens-Fisher problem which he also solves directly using Bayesian inference (his solution is slightly different from the standard Bayesian solution because he uses proper priors).
