Yes, but you are specifically worried about the small-n situation, no? The safeguard here *is* the prior – at least an explicitly specified one. One goal of prior specification is regularization, which is almost always a good thing, since it keeps you from being jerked around by noisy, small-sample data.

“I agree, but also believe that priors with hyperparameters can add even more subjectivity, more levers to pull to ‘hack’”

Depends on how you look at it. The maximum likelihood estimate is (more or less, usually) equivalent to a Bayesian MAP with a flat prior. What that implies is that all values of theta are equally plausible, apart from the information in your small (potentially noisy) sample. This is a disastrous assumption in many cases.
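To make that concrete, here's a minimal sketch (Python, with made-up counts) of the flat-prior case for a binomial success probability: under a flat Beta(1, 1) prior, the posterior mode (the MAP) coincides with the MLE.

```python
# Hypothetical counts: k successes in n trials (values chosen for illustration).
k, n = 3, 10

# Maximum likelihood estimate of theta.
mle = k / n

# A flat prior on theta is Beta(1, 1); the posterior is then
# Beta(k + 1, n - k + 1), whose mode is (a - 1) / (a + b - 2) = k / n.
a, b = k + 1, n - k + 1
map_est = (a - 1) / (a + b - 2)
```

So the flat prior adds nothing beyond the sample – which is exactly the "disastrous assumption" when the sample is small and noisy.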

In some sense, this whole discussion about Bayes is like the trolley problem in moral philosophy. Choosing *NOT* to pull the lever is still a choice, and the consequences can be very bad.

Because, IMO, actual data from a well-designed experiment is more important than prior attitudes/beliefs that can dramatically affect an outcome.

“I think it is important to remember that any statistical modeling–be it frequentist or Bayesian–is contingent upon layers of subjective choices,…”

I agree, but also believe that priors with hyperparameters can add even more subjectivity, more levers to pull to ‘hack’.

Justin

http://www.statisticool.com

P.S. In clinical research, simulated patients are actors pretending to be real patients – truly fake patients.

I wonder if Andrew’s just trying to make it super clear to everyone it’s not “real” data.

For example, suppose we have a uniform prior on a chance-of-success parameter, e.g.,

$latex \theta \sim \mathsf{Uniform}(0, 1).$

Then if we transform $latex \theta$ to the log odds scale,

$latex \phi = \mathrm{logit}(\theta) = \log \frac{\theta}{1 - \theta},$

we wind up with a non-uniform prior for the log odds,

$latex \phi \sim \mathsf{Logistic}(0, 1).$
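A quick simulation sketch (Python; the sample size is arbitrary) makes this change-of-variables result easy to check empirically: logit-transformed uniform draws match the Logistic(0, 1) CDF.

```python
import math
import random

random.seed(0)
n = 100_000

# Draw theta ~ Uniform(0, 1) and map each draw to the log-odds scale.
phi = [math.log(t / (1 - t)) for t in (random.random() for _ in range(n))]

# Empirical CDF of the transformed draws at a point x.
def ecdf(x):
    return sum(p <= x for p in phi) / n

# Logistic(0, 1) CDF for comparison: F(x) = 1 / (1 + exp(-x)).
def logistic_cdf(x):
    return 1 / (1 + math.exp(-x))
```

The empirical CDF of `phi` agrees with `logistic_cdf` to within Monte Carlo error at any point you check.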

You also have to be careful with positive-constrained parameters, where those wide proper priors aren’t uniform. They have a tendency to pull posterior mean estimates away from zero. So the intuitions from penalized maximum likelihood (where flat priors do very little) don’t carry over to Bayesian posterior means, which try to blend the prior and the data. If the prior says the value might be 1000, that’ll be taken into account in the posterior.

For example, consider this Stan program which uses BUGS-style wide proper priors on the mean and scale parameter of a normal distribution. There are three observations, -1, 0, and +1, coded in by hand.

parameters {
  real<lower=0> sigma;
  real mu;
}
model {
  sigma ~ inv_gamma(1e-4, 1e-4);
  mu ~ normal(0, 1e4);
  [-1, 0.0, 1.0] ~ normal(mu, sigma);
}

        mean se_mean  sd  2.5%  25%  50%  75% 97.5% n_eff Rhat
sigma    1.7       0 2.1   0.5  0.8  1.2  1.8   6.1  4529    1
mu       0.0       0 1.4  -2.5 -0.5  0.0  0.5   2.4  4773    1

The posterior mean for sigma is 1.7 and the posterior 95% interval is (0.5, 6.1). The posterior median is still only 1.2. The maximum likelihood estimate (no priors) for the sd is 0.82 (divides by N, giving variance 2/3); the unbiased sample standard deviation estimate is 1.0 (divides by N - 1). These are not what you get from the Bayesian posterior with a wide proper prior such as $latex \mathsf{InvGamma}(0.0001, 0.0001)$.
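As a quick arithmetic check on those classical point estimates for the three observations (note that 2/3 ≈ 0.67 is the MLE *variance*; the MLE sd is its square root, ≈ 0.82):

```python
import math

data = [-1.0, 0.0, 1.0]
n = len(data)
mean = sum(data) / n              # 0.0
ss = sum((x - mean) ** 2 for x in data)  # 2.0

sd_mle = math.sqrt(ss / n)             # divides by N:   sqrt(2/3) ~= 0.82
sd_unbiased = math.sqrt(ss / (n - 1))  # divides by N-1: sqrt(1) = 1.0
```

Both are well below the posterior mean of 1.7 reported above, which is the pull away from zero that the wide proper prior produces.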

I’d be worried with small n that Bayesian inference could become something like a ‘Drake Equation’ type of thing, where one can get about any posterior output based on their prior inputs… I’m not sure any method is too great for small n.

Any method that accurately conveys the uncertainty is great. There is nothing wrong happening in your Drake Equation example.

Unfortunately, we do know there is such a thing as “fake data collection” and that it differs substantially in ethics and effort from the process of “real data collection.”

Let me put it this way: I’d be much more comfortable getting on a flight with a first-time pilot who trained on a flight simulator rather than a first-time pilot who just pretended they were flying a bunch of times.

Here’s a note where a simple brms option toggle lets you flip from prior to posterior predictive draws:

https://ucla.box.com/v/brms-prior-post-pred

The beginning is a toy example, whereas pages 11 and 12 show a more relevant one.

We sometimes use the term “pretend data.”

I believe Andrew has said in various places that the “best” statistical method is the one that best utilizes the most information. There is no magic in the Bayesian formalism from that point of view – it is just a coherent, comprehensive approach that solves a lot of problems, but still subject to GIGO like everything else…

But I don’t really get why the likelihood not swamping the prior should be worrying. If there isn’t enough data to inform us, isn’t it only natural that it shouldn’t affect our prior attitudes if that’s all we have? Of course then the inferences are contingent on the–shudder–subjective prior, but I reckon this is exactly why they should be made explicit. I’m not sure what “agreed upon” priors would be, since they depend on the situation–e.g. prior information on the specific subject–and the model specification/parameterization.
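For intuition about “swamping,” here is a sketch using the conjugate normal model with known data sd (all numbers are illustrative): with n = 3 the prior pulls the posterior mean well away from the sample mean, while with n = 300 the same prior barely matters.

```python
# Conjugate normal model: y_i ~ Normal(mu, sigma^2) with sigma known,
# prior mu ~ Normal(mu0, tau^2). The posterior mean is a
# precision-weighted blend of the sample mean and the prior mean.
def posterior_mean(ybar, n, sigma, mu0, tau):
    prec_data = n / sigma**2   # precision contributed by the data
    prec_prior = 1 / tau**2    # precision contributed by the prior
    return (prec_data * ybar + prec_prior * mu0) / (prec_data + prec_prior)

# Small n: the prior centered at 0 drags the estimate toward 0.
small_n = posterior_mean(ybar=1.0, n=3, sigma=1.0, mu0=0.0, tau=1.0)    # 0.75
# Large n: the data swamp the very same prior.
large_n = posterior_mean(ybar=1.0, n=300, sigma=1.0, mu0=0.0, tau=1.0)  # ~0.997
```

Whether that small-n pull is a bug or a feature depends entirely on whether the prior encodes real information – which is the point about making it explicit.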

I think it is important to remember that any statistical modeling–be it frequentist or Bayesian–is contingent upon layers of subjective choices, and there is nothing inherently wrong with that, as long as this is recognized and made as explicit as practically possible. You could be as worried about how one can get any p-value they ever want by defining a null model that just gives them the results they want–in my view, there is nothing inherently more “objective” in the null models as they are currently used.

(A similar thing applies to frequentist approaches. I’m not sure any method is too great for small n.)

Justin

http://www.statisticool.com

Yes, you’re right. The general idea is described here. But even simulating once from the model can give a lot of insight in many examples. So we generally recommend simulating once, as a way to understand the model and capture gross problems. You can simulate many times for more systematic checks.
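As a sketch of what “simulating once from the model” can look like (a toy model with assumed priors, not the specific example under discussion): draw parameters from the prior, then draw fake data given those parameters, and eyeball the result.

```python
import random

random.seed(1)

# Toy model (all choices here are assumptions for illustration):
#   mu ~ Normal(0, 10), sigma ~ half-Normal(0, 1), y_i ~ Normal(mu, sigma).
mu = random.gauss(0.0, 10.0)
sigma = abs(random.gauss(0.0, 1.0))

# One prior predictive draw of a dataset of 20 observations.
y_sim = [random.gauss(mu, sigma) for _ in range(20)]
```

If this single fake dataset already looks absurd on the scale of your real measurements, you’ve caught a gross problem before fitting anything; repeating the draw many times gives the more systematic check.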
