
The quick way of saying this is that using a mathematical model informed by background information to set a prior distribution for logistic regression is no more “subjective” than deciding to run a logistic regression in the first place.
Here’s a longer version:
Every once in a while you get people saying that Bayesian statistics is subjective bla bla bla, so every once in a while it’s worth reminding people of my 2017 article with Christian Hennig, “Beyond subjective and objective in statistics.” Lots of good discussion there too. Here’s our abstract:
Decisions in statistical data analysis are often justified, criticized or avoided by using concepts of objectivity and subjectivity. We argue that the words ‘objective’ and ‘subjective’ in statistics discourse are used in a mostly unhelpful way, and we propose to replace each of them with broader collections of attributes, with objectivity replaced by transparency, consensus, impartiality and correspondence to observable reality, and subjectivity replaced by awareness of multiple perspectives and context dependence. Together with stability, these make up a collection of virtues that we think is helpful in discussions of statistical foundations and practice.
The advantage of these reformulations is that the replacement terms do not oppose each other and that they give more specific guidance about what statistical science strives to achieve. Instead of debating over whether a given statistical method is subjective or objective (or normatively debating the relative merits of subjectivity and objectivity in statistical practice), we can recognize desirable attributes such as transparency and acknowledgement of multiple perspectives as complementary goals. We demonstrate the implications of our proposal with recent applied examples from pharmacology, election polling and socio-economic stratification. The aim of the paper is to push users and developers of statistical methods towards more effective use of diverse sources of information and more open acknowledgement of assumptions and goals.
An informative prior is basically an assumption about the population the data set is drawn from. So are the assumptions of normality and iid sampling. If you try to avoid assuming an informative prior, then you are in effect assuming that the prior is flat. Or you are assuming that the data of interest have the same properties as some previous data that were analyzed.
It’s hard to set up a problem where you can avoid making assumptions!
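To make the flat-prior point concrete, here is a minimal sketch for the simplest case I can think of, a binomial proportion with a conjugate Beta prior (not the logistic regression from the post). The function name `posterior_mode`, the data (3 successes in 20 trials), and the Beta(5, 5) prior are all made up for illustration.

```python
def posterior_mode(y, n, a=1.0, b=1.0):
    """Mode of the Beta(a + y, b + n - y) posterior for a binomial proportion.

    a = b = 1 is the flat prior on (0, 1).
    """
    a_post, b_post = a + y, b + n - y
    if a_post < 1 or b_post < 1 or a_post + b_post <= 2:
        raise ValueError("posterior has no unique mode for these values")
    return (a_post - 1) / (a_post + b_post - 2)


y, n = 3, 20                                        # hypothetical data
mle = y / n                                         # "no prior at all"
flat = posterior_mode(y, n)                         # flat Beta(1, 1) prior
informative = posterior_mode(y, n, a=5.0, b=5.0)    # hypothetical prior centered on 0.5

print(mle, flat, informative)
```

The "no prior" estimate and the flat-prior posterior mode coincide, so skipping the prior is itself a choice of prior. The informative prior pulls the estimate toward 0.5, which may or may not be sensible depending on what is known beforehand.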
And using prior predictive checks! They are a simple and easy way to visualize your modeling choices and supposed initial state of knowledge. I’ve found them especially useful for hierarchical models; Richard’s got a great discussion of these throughout Statistical Rethinking.
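For anyone who hasn’t tried one, a prior predictive check for a logistic regression can be sketched in a few lines. This is only an illustration under assumptions of my own: one standardized predictor, independent normal(0, sd) priors on the intercept and slope, and the helper names `prior_predictive_probs` and `share_extreme` are hypothetical.

```python
import numpy as np


def prior_predictive_probs(prior_sd, n_draws=4000, seed=0):
    """Draw Pr(y = 1) implied by normal(0, prior_sd) priors on intercept and slope."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_draws)                      # hypothetical standardized predictor
    alpha = rng.normal(0.0, prior_sd, size=n_draws)   # intercept draws from the prior
    beta = rng.normal(0.0, prior_sd, size=n_draws)    # slope draws from the prior
    return 1.0 / (1.0 + np.exp(-(alpha + beta * x)))  # inverse logit


def share_extreme(p, eps=0.01):
    """Fraction of implied probabilities within eps of 0 or 1."""
    return float(np.mean((p < eps) | (p > 1 - eps)))


for sd in (10.0, 1.0):
    p = prior_predictive_probs(sd)
    print(f"prior sd = {sd}: share of probabilities near 0 or 1 = {share_extreme(p):.2f}")
```

In this setup the wide "vague" prior puts most of its mass on success probabilities that are essentially 0 or 1, which is a strong statement about the world even though it is usually labeled noninformative. A histogram of `p` shows the same thing more vividly.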
I agree that in the sense that ‘subjective’ seems to be used here, the use of an informative prior can be no more ‘subjective’ than other modeling decisions a researcher might make… like the choice of likelihood.
Isn’t this point illustrated by the acceptance in frequentist research of likelihoods that are themselves based on distributions that can also be characterized in a hierarchical way, e.g. continuous mixtures like the Beta-Binomial?
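As a small sketch of what I mean by characterizing the Beta-Binomial hierarchically: draw a success probability from a Beta distribution, then draw a count from a Binomial with that probability, and the counts follow the closed-form Beta-Binomial pmf. The parameter values and function names below are arbitrary choices for illustration.

```python
from math import exp, lgamma

import numpy as np


def log_beta(u, v):
    """Log of the Beta function B(u, v)."""
    return lgamma(u) + lgamma(v) - lgamma(u + v)


def beta_binomial_pmf(y, n, a, b):
    """Closed-form pmf: C(n, y) * B(y + a, n - y + b) / B(a, b)."""
    log_choose = lgamma(n + 1) - lgamma(y + 1) - lgamma(n - y + 1)
    return exp(log_choose + log_beta(y + a, n - y + b) - log_beta(a, b))


def simulate_hierarchical(n, a, b, n_draws=200_000, seed=1):
    """Two-stage draw: p ~ Beta(a, b), then y | p ~ Binomial(n, p)."""
    rng = np.random.default_rng(seed)
    p = rng.beta(a, b, size=n_draws)
    y = rng.binomial(n, p)
    return np.bincount(y, minlength=n + 1) / n_draws


n, a, b = 10, 2.0, 3.0   # arbitrary illustrative values
simulated = simulate_hierarchical(n, a, b)
analytic = np.array([beta_binomial_pmf(y, n, a, b) for y in range(n + 1)])
print("largest gap between simulated and analytic pmf:",
      float(np.max(np.abs(simulated - analytic))))
```

The Beta layer here plays exactly the role a prior would play, yet when it is folded into the likelihood nobody calls it subjective.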
I wonder if some of the perception that informative priors add to some sort of undesirable ‘subjectivity’ has to do with some overloading of the word ‘subjective’ in the context of Bayesian statistics.
In other contexts, ‘subjective Bayes’ is about ‘personal probabilities’ and being unapologetic that these don’t have to relate to frequencies. If those principles are being used to specify ‘subjective priors’ then I can see why this would be objectionable to people who require frequentist properties in their estimators and predictors.
So there may just be confusion as to what using a ‘subjective prior’ implies about what the posterior then represents… is it one researcher’s personal posterior? Many users of subjective priors are also seeking frequentist properties in their posterior, and would say ‘no’, the posterior is not a ‘personal’ one.
I also agree with the sentiments about so-called objective priors:
(1) that there is no such thing.
(2) to a Bayesian, using an objective prior or no prior at all induces a prior that is very likely to be less good than the one a researcher would use if they put in a little bit of time to think about how to encode what is generally already known about the model’s parameters.
Oops. I wanted to be more clear in my agreement that the words ‘subjective’ and ‘objective’ are problematic.
So I edited the end of my comment:
“Many users of *informative* priors are also seeking frequentist properties in their posterior, and would say ‘no’, the posterior is not a ‘personal’ one.
I also agree with the sentiments about so-called *noninformative* priors:
(1) that there is no such thing.
(2) to a Bayesian, using a *noninformative* prior or no prior at all induces a prior that is very likely to be less good than the one a researcher would use if they put in a little bit of time to think about how to encode what is generally already known about the model’s parameters.”