More on p-values etc etc etc

Posted on January 20, 2022 8:02 PM by Andrew

Deborah Mayo writes:

How should journal editors react to heated disagreements about statistical significance tests in applied fields, such as conservation science, where statistical inferences often are the basis for controversial policy decisions? They should avoid taking sides. They should also avoid obeisance to calls for author guidelines to reflect a particular statistical philosophy or standpoint. The question is how to prevent the misuse of statistical methods without selectively favoring one side.

This is from an article called, “The statistics wars and intellectual conflicts of interest.” The concept of an intellectual conflict of interest is interesting, and it’s all over statistics and its applications; I wouldn’t know where to start, and there’s definitely no place to stop once you get started on it.

Mayo got several people to comment on this article, and she put it all on her blog, for example here. She suggests we discuss it here, as she (accurately, I think) suspects that our readership would have a much different take on these issues.

The particular discussion I linked to is by John Park, who warns of “poisoned priors” in medical research. My response to this is that all parts of an analysis, including data model, prior distributions, and estimates or assumptions of costs and benefits, should be explicitly justified. Conflict of interest is a real problem no matter what, and I don’t think the solution is to use a statistical approach that throws away data. To put it another way: As Park notes, the tough problems come when data are equivocal and the correct medical decision is not clear. In that case, much will come down to assessed costs and benefits. I think it’s best to minimize conflict of interest through openness and feedback mechanisms (for example, prediction markets, which are kind of a crude idea here but at least provide a demonstration in principle that it’s possible to disincentivize statistical cheating). I mean, sure, if your data are clean enough and your variability is low enough that you can get away with a simple classical approach, then go for it (why not?), but we’re talking here about the tougher calls.

I won’t go through the discussions on Mayo’s blog one by one, but, yeah, I have something to disagree with about each of them!

A lot of the discussion is about p-values, so I’ll remind everyone that I think the problems with p-values are really problems with null hypothesis significance testing and naive confirmationism. I discuss this in my article, The problems with p-values are not just with p-values, and my post, Confirmationist and falsificationist paradigms of science. The trouble is that, in practice, null hypothesis significance testing and naive confirmationism are often what p-values are used for!

There’s also a separate question about whether p-values should be “banned” or whatever. I don’t think any statistical method should be banned. I say this partly because I used to work at a statistics department where they pretty much tried to ban my methods! So I have strong feelings on that one. The flip side of not banning methods is that I should feel no obligation to believe various Freakonomics, Ted-talk crap about beauty and sex ratio or the critical positivity ratio or the latest brilliant nudge study, just cos it happens to be attached to “p less than 0.05.” Nor should anyone feel obliged to believe some foolish analysis just because it has the word “Bayes” written on it. Or anything else.

Anyway, feel free to follow the above links and draw your own conclusions.

166 thoughts on “More on p-values etc etc etc”

Anonymous on January 20, 2022 8:49 PM at 8:49 pm said:

From Philip Stark’s contribution:

“Throwing away P-values because many practitioners don’t know how to use them is perverse. It’s like banning scalpels because most people don’t know how to perform surgery. People who wish to perform surgery should be trained in the proper use of scalpels, and those who wish to use statistics should be trained in the proper use of P-values. Throwing out P-values is self-serving to statistical instruction, too: we’re making our lives easier by teaching less instead of teaching better.”

This is a commonly voiced solution. Everyone who advances it should be required to explain in detail why Frequentists didn’t teach it better during the ~100 years in which they controlled all the major journals, all the major textbooks, dominated the curriculums, controlled stat departments, dominated governmental regulations, …

So come on Frequentists! What is it exactly you need before you can teach p-values right?

- Anonymous on January 20, 2022 9:08 PM at 9:08 pm said:
  
  Seriously, they say “teach it right” as though there was some mysterious force in the Universe preventing them. Was it Bayesian saboteurs? Is that why they couldn’t teach it right?
  
  Or do they claim that out of everything done in the 20th century, p-values are just uniquely impossible to teach?
  
  Do they have a proposed timeline by any chance? Something like “within 87.2 years p-values will be taught correctly if only we allocate $25.6 Billion to the effort”
  
  - Andrew on January 20, 2022 9:48 PM at 9:48 pm said:
    
    “Participants reported being hungrier when they walked into the café (mean = 7.38, SD = 2.20) than when they walked out [mean = 1.53, SD = 2.70, F(1, 75) = 107.68, P < 0.001]."
    
    I assume that p-value was calculated correctly. But . . .
    
    - fritos on January 21, 2022 1:42 AM at 1:42 am said:
      
      Andrew…jhfc, man, be careful, man! Being a smartass who’s right is…dangerous.
  - Anonymous on January 21, 2022 12:34 AM at 12:34 am said:
    
    Really Frequentists, please just fill in the blank for me:
    
    “Teaching P-values correctly solves the issues with p-values and would do immeasurable good for the world, but we failed to teach it correctly despite having all the time, resources, and advantages in the world, because _______”
    
    - Anonymous on January 21, 2022 12:05 PM at 12:05 pm said:
      
      you can make a lot more money selling snake-oil than you can being right.
- Paul Hayes on January 20, 2022 11:06 PM at 11:06 pm said:
  
  So come on Frequentists! What is it exactly you need before you can teach p-values right?
  
  Better students –
  
  Misinterpretation and abuse of statistical tests, confidence intervals, and statistical power have been decried for decades, yet remain rampant. A key problem is that there are no interpretations of these concepts that are at once simple, intuitive, correct, and foolproof. Instead, correct use and interpretation of these statistics requires an attention to detail which seems to tax the patience of working scientists. This high cognitive demand has led to an epidemic of shortcut definitions and interpretations that are simply wrong, sometimes disastrously so—and yet these misinterpretations dominate much of the scientific literature.
  
  – ones able to cope with the “high cognitive demand”.
  
  - Anoneuoid on January 20, 2022 11:24 PM at 11:24 pm said:
    
    The scientific mind (perhaps a more rhetoric-focused mind accepts it) cannot process the “logic” of NHST, because it is based on a fallacy (strawman).
    
    But they also find it hard to believe everyone does something so dumb, so they have come up with a whole bestiary of convoluted explanations for why they are spending time on it.
    
    Once you understand clearly what statistical significance means you would never use NHST and avoid anything that relied on it.
    
  - Anonymous on January 21, 2022 12:48 AM at 12:48 am said:
    
    So the same students who mastered Genetics, the Krebs cycle, Electrodynamics, and Quantum Mechanics, put man on the moon, built the stealth fighter, and cured polio, were felled by the mighty p-value?
    
    - Paul Hayes on January 21, 2022 7:38 AM at 7:38 am said:
      
      Sadly, QM – another application of probability – is in an even worse state than statistical inference.
    - Martha (Smith) on January 22, 2022 12:48 PM at 12:48 pm said:
      
      Did all of those students really “Master” genetics and Quantum Mechanics? And if they did, then maybe we need to teach them statistics by starting out discussing the inherent uncertainty in those fields, and then pointing out that the p-value is just one way of coping with that uncertainty.
      And I would guess that we really need to start teaching students about probability in middle school. (But this really applies to statistics in general: students need to accept the ubiquitousness of uncertainty early on.)
    - Anonymous on January 22, 2022 1:17 PM at 1:17 pm said:
      
      Indeed QM is a bit of a mess, and it’s somewhat ambiguous to say people “master” it, yet every year people learn how to correctly calculate things with it and correctly connect those numbers to the real world – which is more than can be said for p-values evidently.
      
      The real point though is the idea that p-values are uniquely difficult to teach is absurd. Even the slightest acquaintance with the courses and curriculums in STEM fields suggests p-values should be in the top 5% of easiest things they ever learn. That it’s somehow among the most difficult subjects the human race has ever grasped suggests there are things going on here that can’t be fixed by “just teaching it right”.
    - Andrew on January 22, 2022 2:09 PM at 2:09 pm said:
      
      Anon:
      
      People do learn to successfully use p-values, not just to get papers published in Psychological Science or whatever, but also to make decisions on drug approvals, A/B testing, and all sorts of other things. These may not be the best decisions, but they are a way of connecting numbers to the real world.
      
      I agree that p-values aren’t so hard to teach. They’re a lot easier to teach than teaching Bayesian methods! I don’t teach p-values, not because they’re hard to teach, but because I think they’re typically a bad idea.
- Anonny on January 21, 2022 8:51 AM at 8:51 am said:
  
  I’m no fan of NHST, but to be fair to Frequentists, I think it’s a bit facile to say that they “dominated” science. Methodology derived from their philosophy, as practiced by many researchers with a poor understanding or appreciation of the nuances of statistics (Frequentist or otherwise), dominated science. I think this distinction is important, as I highly doubt that, in an alternate universe where everyone was Bayesian from the start (set aside computational issues that made practical Bayes mostly impossible for most of the 20th century), anything would be much different. Publication bias would still be a thing, but it would be driven by Bayes factors instead of p-values, or some other vapid metric. Poor quantitative reasoning or even outright fraud won’t automatically be fixed if we all turned Bayesian. Those are the real issues in my opinion, and that’s every statistician’s responsibility, not just Frequentists.
  
  That said, I strongly disagree that NHST would suddenly be a wonderful idea if everyone knew more about it. In fact, I suspect that a deeper understanding would drive more people away from it! I remember last year I collaborated on a project with some engineers, and they seemed genuinely shocked to learn that the whole “alpha = 0.05” thing is completely arbitrary.
  
- Mendel on January 21, 2022 3:03 PM at 3:03 pm said:
  
  Are Bayesian students taught to solve the Envelope paradox correctly?
  Any method can be abused. The more people use a method, the more people abuse it.
  
  Choose the right tool for the job at hand.
  
  - Keith O'Rourke on January 21, 2022 3:09 PM at 3:09 pm said:
    
    How much agreement is there amongst Bayesians (a largely undefined reference set) on how to solve the Envelope paradox correctly?
    
    (In an applied context, there would be finite amounts of money to put into the envelopes.)
    
  - Anonymous on January 21, 2022 8:47 PM at 8:47 pm said:
    
    It’s not about comparing Frequentists and Bayesians. The point is how is teaching p-values better going to solve anything, when it hasn’t solved anything so far?
    
    If they couldn’t implement this seemingly trivial solution despite having everything they needed in abundance, then why is it suddenly going to work now?
    
  - Daniel Lakeland on January 21, 2022 9:05 PM at 9:05 pm said:
    
    Let A be the amount of money in the lesser envelope.
    There is prob 1/2 that you chose the lesser envelope with A, and 1/2 that you chose the greater envelope with 2A.
    
    The expected change in value from switching is: there is 1/2 chance you are on the envelope with A and will gain an additional A, and 1/2 chance you are on the envelope with 2A and will gain an additional -A (lose A)
    
    Expected value of switching is: -A * 1/2 + A * 1/2 = 0 so you should be indifferent to either switch or not switch.
    
    WTF is the problem?
    
    - Phil on January 22, 2022 2:03 AM at 2:03 am said:
      
      Yeah, yeah, everybody knows that’s the right answer. That’s not the hard part. The hard part is explaining why this is the wrong answer:
      You’ve chosen an envelope. It contains some amount of money, which we will call A.
      1. The other envelope is equally likely to contain 2A or A/2.
      2. Therefore the expected value of the amount in the other envelope is 0.5*(2A) + 0.5*(A/2) = 5A/4.
      3. 5A/4 > A
      4. Therefore you stand to gain by switching, on average.
      
      Please identify which line has the error.
    - Keith O'Rourke on January 22, 2022 9:00 AM at 9:00 am said:
      
      You are forgetting your prior for how much money the person putting money into two envelopes has and is willing to part with. And if you have a prior you have to check it.
      
      For instance, if the money in the envelope happens to be close to the maximum of your prior, don’t switch [with input from David Draper].
      
      (Part of the problem in a subject like statistics is people thinking/claiming they know the right answer without clarifying that that answer is just based on what they think they are assuming while not being fully aware of what they are or should.)
    - Andrew on January 22, 2022 9:10 AM at 9:10 am said:
      
      Keith:
      
      A friend told me about this two-envelope puzzle in grad school and, after some thought, I told him that the correct solution depends on the prior distribution, but a simple solution that dominates the “always switch” or “never switch” responses is the decision, “switch only if what you see is less than X,” for some pre-chosen X. This solution dominates for any X, but of course it’s best to use prior information to find an X that is most likely to fall in between the values of the two envelopes. The simple solution I gave is not necessarily optimal: for some priors, more complicated rules will perform even better. For example, if your prior is that the two values will be both near $100 or both near $1000, then a better rule would be to switch if what you see is less than 100, don’t switch if it’s between 100 and 300, switch if it’s between 300 and 1000, and don’t switch if it’s more than 1000. But the simple rule still dominates the never-switch or always-switch rules, as is trivial to show. It’s an interesting example in which a Bayesian solution can be given different sorts of frequentist analysis. Hmmm, maybe I should write this up as a post…
    - somebody on January 22, 2022 10:05 AM at 10:05 am said:
      
      There are two different situations under discussion here:
      
      1. You pick an envelope, then without looking at the envelope, talk yourself into switching by Phil’s argument, repeat for the new envelope into an infinite loop
      2. You pick the first envelope, then look at the amount, then choose to switch or not. (Keith)
      
      Both paradoxes neglect a prior which is necessary to get a probabilistically sound computation. In Phil’s argument, you actually compute a conditional expectation E(switching | chosen envelope contains A). To compute the total expectation requires averaging over the prior, which by the symmetry of the prior over both envelopes means expected value of switching will always be zero.
      
      In the second scenario, you actually do bayesian updating. This means all bets are off, and sometimes you can get a guarantee about which envelope is the larger/smaller.
      
      You can still get into a situation with some priors where, after looking at one envelope and computing bayes, the situation is still exactly symmetric. The other envelope is equally likely to be the larger or smaller. Expectation value still tells you that you should switch. Some are confused by this and consider that to be paradoxical, but that’s now a decision theory problem. At that point, two objectives have been confounded
      
      1. choose the bigger amount/minimize regret. U(envelope) = 1 if higher 0 else
      
      2. Maximize expected value, U(envelope) = value of envelope
      
      The linear utility here is highly risk seeking; a logarithmic utility would preserve the symmetry. There’s no escaping subjective utility in problems like this, in the end. If you had gambling debts and someone was coming to kill you unless you have $x, and you got $0.75x in your first envelope, it would be totally rational to switch. If you got $1.5x, you would be insane to switch.
    - somebody on January 22, 2022 10:07 AM at 10:07 am said:
      
      but then again, at that point, you probably couldn’t help chasing the thrill
    - David Marcus on January 22, 2022 12:21 PM at 12:21 pm said:
      
      Phil wrote:
      
      > 1. The other envelope is equally likely to contain 2A or A/2.
      
      Of course, that is the error.
      
      Frequentists treat unknowns as fixed parameters, and get into all sorts of confusion as a result. That’s how they ended up with P-values and confidence intervals. They really want the probability of the parameter being somewhere, but they insist that the unknown parameter is not random. So they calculate the probability of something else that is sometimes similar to the probability that they really want.
      
      Bayesians know that you have to model all your unknowns.
      
      I used to work in geodesy. There were people who insisted you couldn’t treat the Earth’s gravity field as random because there is only one Earth.
    - Phil on January 23, 2022 12:49 AM at 12:49 am said:
      
      David Marcus says
      “> 1. The other envelope is equally likely to contain 2A or A/2.
      Of course, that is the error.”
      
      Yes! Give that man a cigar. 2A must be less likely than A/2, on average. There can be some A for which 2A is more likely, and there can be many A for which 2A and A/2 are both equally likely, but these must be counterbalanced by some for which A/2 is more likely.
      
      I’ll go ahead and say a bit more, I’m guessing there are readers here who haven’t thought about this problem.
      
      I think it’s instructive to think about two superficially similar situations that result in different answers.
      
      Situation 1 is the classic two-envelopes problem: someone offers you the choice of two envelopes. They tell you (and you believe them) that one contains twice as much money as the other. You pick one. They offer to let you switch. Should you?
      
      In situation 2, someone gives you an envelope that contains some amount of money. You don’t know how much there is. They offer to let you switch to another envelope that is equally likely to contain half the amount, or twice the amount. Should you?
      
      In Situation 1 the answer is that (on average) it doesn’t matter if you switch. In situation 2 you should switch because on average you stand to increase your take by 25%. Once you really understand, deep down inside, why these situations are different, you understand the problem.
    - Anoneuoid on January 23, 2022 3:40 AM at 3:40 am said:
      
      I’m with you Daniel, it baffles me that this is a thing. Phil wrote:
      
      Situation 1 is the classic two-envelopes problem: someone offers you the choice of two envelopes. They tell you (and you believe them) that one contains twice as much money as the other. You pick one. They offer to let you switch. Should you?
      
      In situation 2, someone gives you an envelope that contains some amount of money. You don’t know how much there is. They offer to let you switch to another envelope that is equally likely to contain half the amount, or twice the amount. Should you?
      
      In Situation 1 the answer is that (on average) it doesn’t matter if you switch. In situation 2 you should switch because on average you stand to increase your take by 25%. Once you really understand, deep down inside, why these situations are different, you understand the problem.
      
      Alice and Bob learn they are getting an inheritance from an eccentric uncle. Both get a choice between two checks corresponding to x or 2x dollars.
      
      Bob gets to see a safe and is told he can pick from two identical envelopes but one contains a check with twice the value of the other.
      
      Alice is given an envelope and shown a safe containing a second envelope. She is told she can keep the one in her hand or swap it for the one in the safe that may be either worth twice as much or half as much as the one in hand.
      
      Alice puts the one she was given into the safe and has someone mix them up.
      
      Now Alice and Bob have the same choice.
    - Mendel on January 23, 2022 4:37 AM at 4:37 am said:
      
      a) you have convinced yourself that an envelope cannot contain less than $0.01 (no ha’penny coin) or more than $1000 (exceeds experimenter’s budget). You open the envelope you are given and see $120. Is this subjective gain of knowledge helping you make a decision? because subjectively, you now know that it’s equally likely that the other envelope has $60 or $240 in it.
      
      b)
      P(B=240|A=120)=0.5
      P(B=60|A=120)=0.5
      To arrive at the correct solution (it doesn’t matter if you switch), 240×0.5×prior(240)=60×0.5×prior(60), with prior(60)/prior(240)=4. This argument leads you to a prior where one of the envelopes likely contains $0.01 and the other $0.02.
      
      c) [modifying an old paradox involving an unexpected execution] you know there can’t be less than $0.01 in any envelope because the smallest coin is $0.01. You also know that no envelope can contain $0.01 because if you chose that envelope first, you’d know for certain you should switch. If there is no $0.01, then there can’t be $0.02 either, because then you’d also know you must switch. $0.03 is right out because it’s odd. Continue to prove by induction that your prior for every possible amount of money being in the envelope is 0.
    - Carlos Ungil on January 23, 2022 7:40 AM at 7:40 am said:
      
      Mendel, what you say seems wrong but maybe that’s your point?
      
      a) “you now know that it’s equally likely that the other envelope has $60 or $240 in it”
      
      You don’t know that. At least not in general, though you may know that in the particular case where you knew that both pairs {$60, $120} and {$120, $240} were equally likely a priori.
      
      b) “to arrive at the correct solution (it doesn’t matter if you switch)”
      
      That’s the correct solution only in the original problem where you pick an envelope but you don’t learn what’s inside.
      
      c) “You also know that no envelope can contain $0.01 because if you chose that envelope first, you’d know for certain you should switch.”
      
      So what? See b).
    - David Marcus on January 23, 2022 8:49 AM at 8:49 am said:
      
      I think the lesson is that you have to be careful with your definitions when calculating probabilities. Daniel says that A is the amount of money in the lesser envelope. Phil says that A is the amount of money in the envelope that you pick. These are not the same, but it is easy to confuse them.
    - Ricardo Silva on January 23, 2022 11:01 AM at 11:01 am said:
      
      Well, just to add an epsilon to the discussion, indeed the flaw that should be obvious is that P(quantity in the other envelope = 2A | quantity in the chosen envelope = A) = 0.5 is the clearly wrong step, because the quantities in the envelopes are not independent if they add up to any finite amount. If we are really set on modeling the total amount itself as a random variable, then instead of committing to one of the two versions of the events (“I pick the envelope with the smallest value” vs “I pick an envelope with value A”), we have three random variables (T, total amount; C, value in chosen envelope; O, value in the other envelope) where P(C = c | T = t) = [0.5, 0.5] with support [t/3, 2t/3] and P(O = 2t/3 | C = t/3, T = t) = 1 and P(O = t/3 | C = 2t/3, T = t) = 1. The only unknown is the distribution of T, and the Bayesian take on the problem boils down to the marginal likelihood P(T > c | C = c), integrating away any prior on p(t). Before even attempting that, it may be easier to work with the model for p(t | c) instead, because for starters no model for the marginal of C will be necessary. But p(t | c) is always [0.5, 0.5] with support in [3c/2, 3c] for any c. *So there are no unknowns* and this is not a problem of statistical inference of any kind, making the discussion of Bayesian vs frequentist completely pointless.
    - Ricardo Silva on January 23, 2022 11:07 AM at 11:07 am said:
      
      Sorry, instead of P(T > c | C = c), it should be P(O > c | C = c)…
    - somebody on January 23, 2022 11:34 AM at 11:34 am said:
      
      @Mendel
      
      b)
      P(B=240|A=120)=0.5
      P(B=60|A=120)=0.5
      To arrive at the correct solution (it doesn’t matter if you switch), 240×0.5×prior(240)=60×0.5×prior(60), with prior(60)/prior(240)=4. This argument leads you to a prior where one of the envelopes likely contains $0.01 and the other $0.02.
      
      This is completely wrong. In that situation, switching does yield the higher expected value. You’re conflating two separate decision problems,
      
      1. Trying to pick the envelope with the highest amount
      2. Trying to maximize expected value
      
      You can simulate it yourself if you need convincing. But consider the simpler system where you’re given $120 and someone offers to flip a coin and double it for heads or halve it for tails. Taking the bet yields a higher expected value and almost surely comes out ahead asymptotically if you repeat this experiment from the beginning.
      
      There’s no advantage if all you care about is “winning” in the binary sense, and if you chain this experiment infinitely, risking your result from the previous iteration instead of starting from $120, always taking the risk almost surely yields 0 in the limit, but those are entirely different questions from the expected value.
    - Phil on January 23, 2022 1:13 PM at 1:13 pm said:
      
      I think the fact that there’s a long Wikipedia article about the Two Envelopes problem shows that there’s something worth thinking about. Otherwise a short article would do.
      
      One envelope contains twice as much as the other. I pick one. I don’t open it; I have no idea how much money is inside. Is it not the case that the other envelope is equally likely to have half as much or twice as much? If I chose the lower one then the other contains twice as much. If I choose the higher then the other contains half as much.
      
      It may be the case that there are some people who instantly see the problem. All I can say is that when I first heard the problem I thought about it for over an hour before I understood it. I mean, of course I saw instantly that switching can’t matter on average, that’s obvious, but it took me quite a bit of thinking to understand what was wrong with the counter argument. I’ll also say that I have since introduced this to around ten or twelve people and not one of them understood it instantly…although of course they, too, realized switching can’t matter.
      
      Those of you who do understand instantly what’s wrong with the argument that the other envelope is equally likely to contain half as much or twice as much as what I have in the envelope, congratulations, you are in a small minority.
    - Andrew on January 23, 2022 1:22 PM at 1:22 pm said:
      
      Phil:
      
      Yeah, if all the stuff I was good at was easy for everybody, I’d have to work a lot harder to stay employed!
    - somebody on January 23, 2022 1:42 PM at 1:42 pm said:
      
      @Mendel
      
      Julia simulation that matches your setup exactly
      https://pastebin.com/TXFNS7zm
      
      One strategy switches only if the first pick is guaranteed to be smaller, the other switches as long as it’s not guaranteed to be larger. The switching strategy has a mean of ~468, the other has a mean of ~437.
      
      @Phil
      
      For your original setup (no looking), I think it’s hard if you try to compute the expected value of the envelope and easy if you try and compute the expected loss from switching like Daniel. I think the former approach is what I would reach for first even though it makes this harder, but also I think the presentation of the problem can make this much harder or much easier. The way the question is typically asked in person is to present an incorrect solution and silently conflate conditional and total expectations. If you just read the first half of the problem, the set-up with the envelopes, and don’t read the wrong solution at all, you’re much less likely to be confused. It’s like this:
      https://en.wikipedia.org/wiki/Missing_dollar_riddle
    - Keith O'Rourke on January 23, 2022 1:48 PM at 1:48 pm said:
      
      Andrew +1 (a lot of lengthy discussion does mean the meaning (resolution) needs be complicated).
    - somebody on January 23, 2022 2:19 PM at 2:19 pm said:
      
      Explaining why a wrong answer is wrong can be harder and more instructive than getting the right answer. That’s the fun of pseudoparadoxes.
    - Ricardo Silva on January 23, 2022 7:00 PM at 7:00 pm said:
      
      Phil,
      
      The thing is, you wrote the problem so neatly with steps 1/2/3/4, where 2/3/4 are so automatic, that it left step 1 as the clear culprit (not to mention that I was traumatised enough from my string of B grades in my postgrad statistics classes, where more than once I was fooled into assuming two variables were independent without paying attention to their support…). If I had started from “One envelope contains twice as much as the other… If I choose the higher then the other contains half as much.” somehow I think I’d be far more confused.
      
      Just saw that Wikipedia page on the two envelope problem… Wow. To quote Daniel: “WTF?”
    - Keith O'Rourke on January 23, 2022 7:29 PM at 7:29 pm said:
      
      > does mean the meaning (resolution) needs be complicated
      
      ARGHHH!!! does NOT mean the meaning (resolution) needs be complicated
    - Phil on January 24, 2022 2:29 PM at 2:29 pm said:
      
      In the context of the discussion here, I especially like this part of the Wikipedia page:
      
      “The puzzle: The puzzle is to find the flaw in the very compelling line of reasoning [that suggests switching is beneficial on average]. This includes determining exactly why and under what conditions that step is not correct, in order to be sure not to make this mistake in a more complicated situation where the misstep may not be so obvious. In short, the problem is to solve the paradox. Thus, in particular, the puzzle is not solved by the very simple task of finding another way to calculate the probabilities that does not lead to a contradiction.
      
      Multiplicity of proposed solutions
      Many solutions have been proposed. Some simple, some very complex. Commonly one writer proposes a solution to the problem as stated, after which another writer shows that altering the problem slightly revives the paradox. Such sequences of discussions have produced a family of closely related formulations of the problem, resulting in a voluminous literature on the subject.[3] To keep this article short, only a small fraction of all proposed ideas for a solution are mentioned below.
      
      No proposed solution is widely accepted as definitive. Despite this it is common for authors to claim that the solution to the problem is easy, even elementary. However, when investigating these elementary solutions they often differ from one author to the next.”
Deborah Mayo on January 20, 2022 9:17 PM at 9:17 pm said:

Here is my editorial published in Conservation Biology:
https://conbio.onlinelibrary.wiley.com/doi/full/10.1111/cobi.13861

We recently had a special session of the Phil Stat (remote) forum, with Y. Benjamini and D. Hand, to discuss both it and the ASA President’s task force on statistical significance and replicability.
I temporarily stopped posting the few remaining commentaries because of the sad news of the death of Sir David Cox.

- Sameera Daniels on January 20, 2022 9:54 PM at 9:54 pm said:
  
  Glad to see you posting here Deborah. I enjoy those Phil Stat forums. The one with Benjamini and Hand was particularly engaging. I admire your sustained passion and pursuit of clarity.
  
  I find myself aligned with Andrew about null hypothesis statistics testing. And the more I read about p-values, the less I think I understand their utility.
  
Sameera Daniels on January 20, 2022 9:23 PM at 9:23 pm said:

I was able to attend Deborah Mayo’s online forum, which is a treat for me b/c I miss having direct conversations with intellectuals. Any of you are, I’m sure, welcome.

It would be great to see Andrew, Sander Greenland, Keith O’Rourke, Daniel Lakeland, and others speak at Deborah’s online Forum.

Deborah Mayo on January 20, 2022 9:32 PM at 9:32 pm said:

The commentator on “poisoned priors” was oncologist John Park, not Brian Dennis. Park’s comment is here:
https://errorstatistics.com/2022/01/17/john-park-poisoned-priors-will-you-drink-from-this-wellguest-post/

- Andrew on January 20, 2022 9:45 PM at 9:45 pm said:
  
  Fixed; thanks.
  
  - Joe on January 22, 2022 3:07 PM at 3:07 pm said:
    
    Ah, that explains it. You corrected the name the first time you referred to him but not the second. I was wondering who “Dennis” was in “As Dennis notes…”
    
    - Andrew on January 22, 2022 3:25 PM at 3:25 pm said:
      
      ok, I fixed that too now!
Anoneuoid on January 20, 2022 9:37 PM at 9:37 pm said:

From the blog post:

Fisher himself, inventor of P values and a considerable portion of other statistical methods used by generations of ecologists, helped ecologists quantify patterns of biodiversity (Fisher et al. 1943)

This Fisher 1943 paper is a perfect example of a non-NHST use of p-values. He calculates the p-values for *the output of his theoretical model*.

https://www.jstor.org/stable/1411

This *is* science and exactly the opposite of NHST, where you expect to observe something but test a hypothesis of no correlation/difference that has nothing to do with your hypothesis. That just happens to be 99.99+% of the use of p-values.

The problem is not p-values, it is the null hypothesis. Until this is fixed there is no need for further discussion about p-values.

Christian Hennig on January 21, 2022 5:10 AM at 5:10 am said:

I hope I can shamelessly promote my contribution here, it’s at
https://errorstatistics.com/2022/01/09/the-asa-controversy-on-p-values-as-an-illustration-of-the-difficulty-of-statistics/
Just in case anyone is interested. Please feel free to comment & respond.

- Andrew on January 21, 2022 7:27 AM at 7:27 am said:
  
  Christian:
  
  Thanks for the link. I love your discussion and will devote a separate post to it!
  
Rahul on January 21, 2022 5:28 AM at 5:28 am said:

It’s all good to say we should use costs and benefits but what fraction of papers actually go there?

Mostly people use a prior, do a Bayesian analysis, publish, and end of story.

How many actually grapple with the cost benefit issue? It’s usually left for someone else to do, which rarely happens.

David Marcus on January 21, 2022 9:09 AM at 9:09 am said:

> I don’t think any statistical method should be banned.

Fine. But, just because textbooks or professionals suggest using a statistical method (or approach) or the method has been used for many years, doesn’t mean it is a good method. So, the world would be a better place if authors didn’t use bad methods, and journals didn’t publish such papers. Calling a bad method a “philosophy” doesn’t make it a good method. It is just camouflage: Philosophies can’t be wrong, can they? Actually, they can.

We don’t ban religions, but that doesn’t mean what religions say is true.

jd on January 21, 2022 9:20 AM at 9:20 am said:

There is a gas leak, the warehouse is burning down, and the fire chiefs stand about arguing whether to use the ladder truck or the helicopter with bucket (in my experience, Bayes is def the helicopter).

I’m not worthy to debate the topics here, but I am ‘down in the trenches’ as it were, doing data analysis for a large variety of projects and researchers (from psychologists to physicians to basic scientists). I’ve only been at this about 5 years, but what I have seen is that there is a deep misunderstanding about what statistics (of any persuasion) can provide, and no statistics will ever provide what is wanted. Because what is wanted is a delusion of grandeur – really some sort of diviner – that confidently declares ‘finding’ from ‘no finding’. The gas leak in this fire is the perverse incentives that necessitate the need for constantly coming up with big ‘findings’ in order to continue to do research.

So perhaps a better solution is finding how many gas lines connect to the building and calling the respective gas companies? Doesn’t sound like an easy answer, but it may be a more upstream solution than using the ladder truck.

As far as ladder trucks vs helicopters go, once I’ve explained the correct definitions of p-values, confidence intervals, and Bayesian credible intervals to researchers, the Bayesian persuasion typically is much closer to being useful. And as a fireman myself, I find the helicopter a much more flexible solution in difficult to reach fires, because ladder trucks require roads, but helicopters can fly ;-)

- Anoneuoid on January 22, 2022 1:56 PM at 1:56 pm said:
  
  Because what is wanted is a delusion of grandeur – really some sort of diviner – that confidently declares ‘finding’ from ‘no finding’. The gas leak in this fire is the perverse incentives that necessitate the need for constantly coming up with big ‘findings’ in order to continue to do research.
  
  They need to come up with a quantitative prediction and test that. Then all those perverse incentives begin acting in reverse (as beneficial incentives). The researchers will now do all they can to get an “insignificant” result like seeking out all the sources of systematic error, etc.
  
  The root cause of the problem was figured out way back in 1967:
  
  Meehl, P.E. (1967) Theory-Testing in Psychology and Physics: A Methodological Paradox. Philosophy of Science, 34, 103-115.
  https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.693.8918&rep=rep1&type=pdf
  https://doi.org/10.1086/288135
  
John Park, M.D. on January 21, 2022 2:49 PM at 2:49 pm said:

Prof. Gelman, Thank you for interacting with my work. It is quite an honor sir.

When you noted “My response to this is that all parts of an analysis, including data model, prior distributions, and estimates or assumptions of costs and benefits, should be explicitly justified. Conflict of interest is a real problem no matter what, and I don’t think the solution is to use a statistical approach that throws away data.”

My whole point was that in medicine, this may not be able to be done well, and practically speaking, if main expert conflicts and biases are so bad, why use priors from someone with less expertise. Frankly, their data should be “thrown away” so to speak b/c it’s no good.

If it can, I suppose, Bayesian methods need to provide something our typical freq. studies cannot provide. Are there faults? Yes! If we physicians & scientists are no good at stats and p-hack, it is certainly not the fault of NHST, p-values, or esp. the error statistician! If you think p-hacking is bad, wait till you put Bayesian methods into the same data-dredging hands – it will be a nightmare.

In earlier phase studies looking for signals, observational studies, and retrospective studies, Bayes certainly has advantages. In rare cancers or for emergencies where only fast action (think beginning stages of COVID) then I think it makes little sense to randomize 10 people, but to use the best priors we have with Bayesian methods. I have an article cooking on this that hopefully will be published soon.

- David Marcus on January 22, 2022 8:20 AM at 8:20 am said:
  
  There is a pervasive mistaken belief that frequentist methods “don’t use priors”. In fact, it is impossible to correctly analyze data without using a prior. For many frequentist methods, the method corresponds to a Bayesian method with a particular prior. If you don’t mention this prior, it doesn’t mean that your analysis isn’t using this prior. Unfortunately, some frequentist methods don’t correspond to any Bayesian method. These are usually illogical or wrong.
  
  Of course, using a sound framework doesn’t guarantee that it will be used well. But, better to start with a sound framework than hope you can teach people to use an unsound one well.
  
  An advantage of Bayesian methods is that they don’t hide important assumptions by pretending that they aren’t being used.
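
One textbook instance of the correspondence described above, offered as a sketch under stated assumptions rather than as the commenter's own example: for a normal mean with known standard deviation, the usual confidence interval is the numerical limit of a Bayesian credible interval as a normal prior becomes flat. The data, the function names, and the prior settings are all made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean

def confidence_interval(data, sigma, level=0.95):
    """Classical interval for a normal mean with known sigma."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * sigma / sqrt(len(data))
    return fmean(data) - half, fmean(data) + half

def credible_interval(data, sigma, prior_mean, prior_sd, level=0.95):
    """Central posterior interval under a Normal(prior_mean, prior_sd) prior."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Conjugate update: precisions add, means are precision-weighted.
    precision = len(data) / sigma**2 + 1 / prior_sd**2
    post_mean = (sum(data) / sigma**2 + prior_mean / prior_sd**2) / precision
    half = z / sqrt(precision)
    return post_mean - half, post_mean + half

data = [4.9, 5.6, 5.1, 4.4, 5.8, 5.2]  # hypothetical measurements
sigma = 0.5                            # assumed known

ci = confidence_interval(data, sigma)
print("confidence interval:", ci)
for prior_sd in (0.5, 5.0, 500.0):
    print("prior sd", prior_sd, "->", credible_interval(data, sigma, 0.0, prior_sd))
```

As the prior standard deviation grows, the two intervals agree to many decimal places. The numbers coincide; whether the two intervals mean the same thing is exactly what the replies in this thread dispute.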
  
  - John Park, M.D. on January 22, 2022 4:55 PM at 4:55 pm said:
    
    Thanks for the response. I have heard this critique before that frequentists “use priors.” Do you mean bringing in large amounts of background information that affects targets (eg are we looking for a 5% difference or 20%) and model selection (Kaplan Meier, Cox PH, hierarchical ordering, etc.)? This is not like taking a Bayesian prior (Beta, gamma, normal, uniform, etc.), combining it with the data, and then computing a posterior. Bayesians also claim the error probability can be captured by this and with more sophisticated methods using MCMC simulations. I am not convinced the “freq. prior” is analogous.
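
For readers who have not seen the "prior, combine with data, posterior" step written out, here is a minimal sketch using the conjugate Beta-binomial case. The trial counts and the three priors are invented for illustration and are not from any study discussed here.

```python
def beta_binomial_update(a, b, successes, failures):
    """Posterior Beta parameters after binomial data under a Beta(a, b) prior."""
    return a + successes, b + failures

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

successes, failures = 7, 13  # hypothetical trial outcome: 7 responders out of 20

priors = {
    "uniform Beta(1, 1)": (1, 1),
    "sceptical Beta(2, 8)": (2, 8),
    "optimistic Beta(8, 2)": (8, 2),
}

posterior_means = {}
for label, (a, b) in priors.items():
    post_a, post_b = beta_binomial_update(a, b, successes, failures)
    posterior_means[label] = beta_mean(post_a, post_b)
    print(f"{label}: posterior Beta({post_a}, {post_b}), mean {posterior_means[label]:.3f}")
```

The same data give different posterior means under the three priors, which is why the choice of prior has to be argued for and not just stated.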
    
    - David Marcus on January 23, 2022 9:04 AM at 9:04 am said:
      
      John Park, M.D., wrote:
      
      > Do you mean bringing in large amounts of background information that
      > affects targets
      
      No, I do not mean that. I mean that (as Christian says) the frequentist says that they are calculating something different. But (in most cases), people don’t want what the frequentist is calculating. They really want to know what the Bayesian is calculating. So, the frequentist is pulling a sleight of hand: You want to know what X is? Fine. I can tell you what Y is without using a prior. And, X and Y sound similar so the person doesn’t realize that they aren’t getting X.
      
      That’s why people keep explaining P-values wrong. Telling them “clearly” (as Christian suggests) that they are wrong doesn’t help. What does help is to tell them how to calculate the thing that they really want.
      
      It really isn’t hard to see that frequentist analysis is not what people want: The dependence on the stopping rule. The fact that the same data could come from an ESP experiment, an expert telling you who wrote a piece of music, or a lady tasting tea.
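The stopping-rule dependence mentioned above has a standard textbook illustration, which is not taken from this thread: a record of 9 heads and 3 tails, testing a fair coin against a heads-biased one. The sketch below assumes two hypothetical designs, "flip exactly 12 times" and "flip until the third tail", and computes the one-sided p-value under each. The function names are made up for illustration.

```python
from math import comb

def p_value_fixed_n(heads, n):
    """One-sided p-value P(X >= heads) for a fair coin when the number
    of flips n is fixed in advance (binomial sampling)."""
    return sum(comb(n, k) for k in range(heads, n + 1)) / 2**n

def p_value_stop_at_tails(heads, tails):
    """One-sided p-value for a fair coin when flipping continues until
    `tails` tails have appeared (negative binomial sampling).
    Seeing at least `heads` heads before the final tail is the same event
    as the first heads + tails - 1 flips holding at most tails - 1 tails."""
    m = heads + tails - 1
    return sum(comb(m, k) for k in range(tails)) / 2**m

p_fixed = p_value_fixed_n(9, 12)
p_stop = p_value_stop_at_tails(9, 3)
print(f"fixed n = 12:       p = {p_fixed:.4f}")
print(f"stop at third tail: p = {p_stop:.4f}")
```

Under these assumptions the same 9-and-3 record gives a p-value of about 0.073 in the first design and about 0.033 in the second. The likelihood, as a function of the coin's bias, is the same in both designs up to a constant factor.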
    - Christian Hennig on January 23, 2022 9:22 AM at 9:22 am said:
      
      @David: Actually my experience is that what many people “really want” is a Bayesian probability for a parameter (or a set of parameters) to be true without the need to specify a Bayesian prior. “The best of both worlds” ;-)
      
      I also find dubious the claim that Bayesians can give people properly what they want, because I believe people want to know what the true parameter of the data generating process is (or if you can’t know it precisely, a probability for that), but most Bayesians interpret probability in an epistemic way, which means that such a true parameter doesn’t exist, for which reason the Bayesian calculation doesn’t say anything about where it is. Believing in the existence of a true parameter implies a frequentist interpretation of probability.
      
      Personally, by the way, I’d agree with de Finetti, Box, and others, that true models and true parameters don’t exist, and I’d rather see both frequentists and Bayesians acknowledging this (this doesn’t necessarily mean that I need to use epistemic probability, see https://arxiv.org/abs/2007.05748), rather than pretending that any analysis could give people “what they want”.
    - David Marcus on January 23, 2022 10:07 AM at 10:07 am said:
      
      Christian Hennig wrote:
      
      > many people “really want” is a Bayesian probability for a parameter (or a
      > set of parameters) to be true without the need to specify a Bayesian
      > prior.
      
      Yes. But, it is easy to see that that is impossible (lady tasting tea). The choice then becomes one between a probability for the parameter or a probability for something else that sounds like what the person asked for but is completely different.
      
      Frequentists should give people an honest choice: Would you like a probability for a parameter where we have to discuss the prior? Or, would you like a probability for something else, which isn’t the probability you asked for, and where you have to tell me your stopping rule in advance?
      
      Another example is least squares. This is easy to justify if things are normal. But, people say if you do least squares you don’t have to assume normality. OK, but if things were really far from normal, how would you justify using squared error? Just because you don’t mention something doesn’t mean you aren’t assuming it.
      
      I’m sorry, but I don’t understand your two other paragraphs. Are you saying that all probabilities are of the same kind? That’s not true. We use probability to model physical processes, and we also use probability to model uncertainty.
    - Christian Hennig on January 23, 2022 1:05 PM at 1:05 pm said:
      
      @David: My point is that if people want probabilities regarding the true parameter, this implies that a true parameter exists, which many Bayesians deny, and for good reasons. Using an epistemic interpretation of probability (which is Bayesian mainstream as far as I know), probabilities are *not* about any “true parameter”, but rather about implications of the researchers’ beliefs, to be cashed out predicting future observations. My impression is that a Bayesian approach is often sold to people as providing probabilities regarding true parameters, which in my view is as problematic as many of the misinterpretations of p-values that we see. The idea that the model refers to a data generating mechanism that exists in reality is a frequentist one, and therefore being interested in the “true parameter” for such a mechanism is an implicitly frequentist concept.
      
      A frequentist test is about finding out whether a model is compatible with the data, which does not imply that the model and the true parameter in fact exist, so it has a valid interpretation even if we don’t believe in the model.
      
      I should add though that of course frequentist tests are taken to be about existing true parameters by many, and that Bayesian analysis is not incompatible with a frequentist idea of what a “true model” is supposed to mean (but this goes against much Bayesian philosophy).
    - Carlos Ungil on January 23, 2022 2:15 PM at 2:15 pm said:
      
      Christian:
      
      > Using an epistemic interpretation of probability, probabilities are *not* about any “true parameter”, but rather about implications of the researchers’ beliefs, to be cashed out predicting future observations.
      
      Why can’t probabilities represent the researcher’s beliefs about some “true parameter”? It will be misleading in that case to say that ‘probabilities are not about any “true parameter”’. What would they be about?
      
      > The idea that the model refers to a data generating mechanism that exists in reality is a frequentist one, and therefore being interested in the “true parameter” for such a mechanism is an implicitly frequentist concept.
      
      I was not aware that using models to relate observations to some underlying reality is by definition an act of frequentism. That makes the label much less interesting though and renders the Bayesian / frequentist divide almost meaningless.
    - David Marcus on January 23, 2022 4:30 PM at 4:30 pm said:
      
      Christian,
      
      I don’t think getting into a philosophical discussion about the meaning of “true” is helpful. A parameter represents something meaningful to the experimenter in the world. People using statistics want to know something about that thing. So, they need a procedure that tells them something about that thing, including how accurately they have been able to determine the thing. They don’t need probabilities about something else entirely.
      
      The fact that the experimenter has probably simplified how the world works doesn’t change what the experimenter is interested in. The fact that models are not perfect is one reason why we have to think about both the likelihood and the prior and how they interact when developing a model.
    - Christian Hennig on January 23, 2022 4:51 PM at 4:51 pm said:
      
      @Carlos
      > Why can’t probabilities represent the researcher’s beliefs about some “true parameter”? It will be misleading in that case to say
      > that ‘probabilities are not about any “true parameter”’. What would they be about?
      
      They can, as far as I’m concerned. However, if this is the true parameter of a true probability model, that probability model isn’t epistemic, and is something that de Finetti and many other Bayesians state does not exist.
      
      > I was not aware that using models to relate observations to some underlying reality is by definition an act of frequentism.
      > That makes the label much less interesting though and renders the Bayesian / frequentist divide almost meaningless.
      
      “Relate” can mean many things. It is frequentist if the probability model in question is used to model the real data generating process and is interpreted as generating relative frequencies corresponding to the probabilities in the long run (that’s where the term “frequentism” comes from). Leading Bayesians such as de Finetti or Phil Dawid have emphasised that probability in this sense “does not exist”, and probability is to be used as expression of epistemic uncertainty. They wouldn’t agree with you that the divide, interpreted in this sense, is meaningless.
  - Christian Hennig on January 23, 2022 6:26 AM at 6:26 am said:
    
    I think it’s inappropriate to claim that if a frequentist method can formally be written down as a Bayesian method with a certain prior, then implicitly the frequentist method “assumes” the prior and the Bayesian method is just more explicit about it. The interpretation of the frequentist output is not the same, so even though there may be mathematical equivalence, the prior is not used in the same way. The frequentist doesn’t make any statements about a probability of any parameter or set of parameters being true, which is what the prior is used for in the Bayesian framework, therefore the frequentist doesn’t “assume” the prior in the way the Bayesian does. It would be more appropriate to say that the Bayesian can numerically reconstruct the frequentist analysis using a particular prior, but also then the difference in interpretation doesn’t go away.
    
    I grant the Bayesian that many people misinterpret frequentist analyses in a Bayesian manner, but I think that they should be clearly told that this is wrong, rather than justifying it by stating that instead of doing a wrong frequentist analysis, they do a correct Bayesian one with a hidden prior.
    
    - Andrew on January 23, 2022 9:26 AM at 9:26 am said:
      
      Christian:
      
      As Rubin put it, “Bayesian” is an approach for making statistical inferences, “Frequentist” is an approach for evaluating statistical inferences. The term “frequentist method” isn’t so clearly defined: its meaning is any statistical method that is deemed to have good frequentist properties, or maybe just any statistical method that doesn’t use formal prior distributions, or something like that. Just as any statistical method can be interpreted in terms of Bayesian inference (even if only to say that it’s not Bayesian in that it violates Bayesian principles in some way), conversely, any Bayesian method can be evaluated in terms of various frequentist principles. As I’ve written, Bayesians are frequentists.
    - Ricardo Silva on January 23, 2022 11:26 AM at 11:26 am said:
      
      Hi, Christian
      
      I’m all for Bayes and likelihoods, but I agree this is silly. It even goes beyond priors, with people saying that we are “assuming a likelihood” when using particular model-free estimators. Every time someone (typically, a Bayesian machine learner) tells me that using regression with least squares is equivalent to assuming a Gaussian distribution of the errors, I ask back whether using the sample average as an estimator of the mean also assumes Gaussianity – since that’s also a solution to a least squares problem!
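The claim that the sample average is itself a least-squares solution can be checked numerically. The sketch below uses made-up, deliberately skewed data and a crude grid search (both are assumptions for illustration) to find the constant that minimises squared error, and for contrast the constant that minimises absolute error.

```python
from statistics import fmean, median

def sum_squared_error(data, c):
    """Total squared distance from the constant c to each observation."""
    return sum((x - c) ** 2 for x in data)

def sum_absolute_error(data, c):
    """Total absolute distance from the constant c to each observation."""
    return sum(abs(x - c) for x in data)

data = [1.0, 2.0, 2.5, 4.0, 20.0]          # hypothetical, far from Gaussian
grid = [i / 100 for i in range(0, 2501)]   # candidate constants 0.00 .. 25.00

best_ls = min(grid, key=lambda c: sum_squared_error(data, c))
best_lad = min(grid, key=lambda c: sum_absolute_error(data, c))

print("least-squares constant:", best_ls, " sample mean:", fmean(data))
print("least-absolute constant:", best_lad, " sample median:", median(data))
```

No distributional assumption enters the calculation: the squared-error minimiser lands on the sample mean and the absolute-error minimiser on the median, whatever the data look like. Whether the mean is the quantity worth estimating for such data is a separate question.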
    - Christian Hennig on January 23, 2022 12:52 PM at 12:52 pm said:
      
      @Andrew: I’d pretty much agree with this, I used “frequentist method” above to refer to what most people would call a frequentist method, but I also think that Bayesian inference can be done with a frequentist interpretation of probability, and with good frequentist performance, depending on the situation of course (as actually explained in our paper).
      
      @Ricardo: Nice to read you!
    - David Marcus on January 23, 2022 12:55 PM at 12:55 pm said:
      
      Ricardo,
      
      If you have formula X that gives the solution (maybe approximately) to problem A, then the hypotheses in problem A tell you sufficient conditions for formula X to give you the correct answer. If you really want the solution to a different problem B, and B has a very different solution, then it doesn’t matter how you arrived at X, it isn’t the right formula to use.
      
      Another example is the Kalman filter. You can derive it assuming normality and using conditional probability (which is how Kalman thought of it). But, you can also derive the formulas using least squares. But, if things are far from normality, it probably won’t work well, even though you didn’t assume normality when you derived it using least squares.
      
      The problem is that the frequentist approach doesn’t fully specify the problem. The parameters are fixed unknowns, but the goal is estimates with uncertainty. And, you need probability to model uncertainty. Once you fully specify the model, then you can see what hypotheses you really need to get the answer. And, you can also see what happens if you change the hypotheses.
    - Keith O'Rourke on January 23, 2022 1:43 PM at 1:43 pm said:
      
      Ricardo > the sample average as an estimator of the mean also assumes Gaussianity
      Well that is how Gauss came up with the Gaussian distribution – what distribution justifies the mean as the best combination?
      
      The 1911 paper of Keynes mentioned above, acknowledged and revisited Gauss’ derivation of the Normal distribution as the only symmetric distribution whose best combination was the mean and also investigated this for the median and the mode.
      
      https://phaneron0.files.wordpress.com/2015/08/thesisreprint.pdf
    - Ricardo Silva on January 23, 2022 6:38 PM at 6:38 pm said:
      
      Thanks for this, Keith. I’m feeling a bit dense and didn’t quite get what you meant by “best combination” – it will be worthwhile for me to check out Keynes’ and Gauss’ original formulation (I just had Laplace’s in mind, i.e. the normal distribution as motivated by the CLT).
      
      None of this of course detracts from the fact that it would be bananas to say that we are implicitly assuming Gaussianity when using the sample average as an estimator of the mean… Something like Glivenko-Cantelli to justify the sample average makes far far more sense. But thanks again for sharing the point, it’s always very nice to keep learning new things about something so taken-for-granted such as the Gaussian distribution!
      
      And following on David Marcus’ point, I fully agree that this should never grant anyone the power to say that it doesn’t matter what the generative model of a problem would be, even if we don’t want the baggage of a model-based estimator. If we were told e.g. that the distribution would be a mixture of Gaussians with a good gap between them, we might justifiably argue why on earth we would be interested in learning the mean of the distribution (off-the-shelf utility functions be damned, we might plausibly question the sanity of someone who thinks minimising sums of squares is a good idea in this case). There might be valid reasons to want the mean, but it wouldn’t be automatically obvious and not grant anyone an assumption-free ride.
    - Christian Hennig on January 24, 2022 6:08 AM at 6:08 am said:
      
      Not sure whether anyone still reads this thread, but the issue of “model assumptions” is a subtle but very central one. If we say that LS-estimators such as the mean “assume” normality, technically this means that under normality there is an optimality result for the LS-estimator (due to the fact that the squared loss function is the normal loglikelihood actually). It for sure doesn’t mean that the LS-estimator doesn’t work or can’t be used in any other situation. To claim such a thing would be generally silly because “all models are wrong” and we couldn’t do anything in statistics if model assumptions needed to be literally fulfilled. In fact there are a lot of (modelled) situations in which we know that the LS-estimator is pretty good despite not being optimal (finding something optimal in reality would require knowing the “true” distribution, which doesn’t exist). Knowing that the LS-estimator is optimal under normality contributes to knowledge and understanding of the LS-estimator, but doesn’t in any way constrain its use to normal distributions (granted that there are certain non-normal distributions under which LS-estimators are indeed bad). The squared loss function can be justified in many applications without any reference to normality, and (to give another, if somewhat trivial, example of a justification of a method without reference to distributional assumptions) the mean can be understood as distributing a sum equally among the contributors, which has a relevant meaning for example when it comes to the distribution of budgets.
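
The parenthetical remark above about squared loss and the normal log-likelihood can be spelled out as follows, assuming i.i.d. observations x_1, …, x_n from a normal distribution with unknown mean μ and known variance σ² (the symbols are introduced here for illustration):

```latex
\begin{align*}
\log L(\mu)
  &= \sum_{i=1}^{n} \log\!\left[\frac{1}{\sqrt{2\pi\sigma^{2}}}
     \exp\!\left(-\frac{(x_i-\mu)^{2}}{2\sigma^{2}}\right)\right] \\
  &= -\frac{n}{2}\log(2\pi\sigma^{2})
     - \frac{1}{2\sigma^{2}}\sum_{i=1}^{n}(x_i-\mu)^{2}, \\
\hat{\mu}_{\mathrm{ML}}
  &= \arg\max_{\mu}\,\log L(\mu)
   = \arg\min_{\mu}\,\sum_{i=1}^{n}(x_i-\mu)^{2}
   = \frac{1}{n}\sum_{i=1}^{n}x_i .
\end{align*}
```

The first term does not involve μ, so under this model maximising the likelihood and minimising the squared loss pick out the same estimate. The last equality holds for any data, with or without normality, which is the distinction being drawn above.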
      
      Regarding the assumption of a Bayesian prior, making a posterior statement about the probability where the parameter is to be found formally relies strongly on the prior, but of course also here one can argue that this statement may (or may not) be fairly robust under replacement of the prior and/or the likelihood with something else that would have an about as good justification for the situation in hand – a certain degree of robustness is absolutely essential for applicability in practice, because “all models are wrong” (and be they epistemic ones). Similarly, frequentist inference, tests, confidence intervals rely on the model but can be shown to be robust (or not) to misspecification (depending on how exactly the model is misspecified of course). The frequentist statement doesn’t involve a prior though, so there is one thing less to worry about when it comes to model misspecification, and this is not touched by whether formally the frequentist inference can be equivalently written down as a Bayesian one with a certain prior.
    - Andrew on January 24, 2022 1:52 PM at 1:52 pm said:
      
      Christian:
      
      Your statement, “The frequentist statement doesn’t involve a prior though, so there is one thing less to worry about when it comes to model misspecification,” reminds me of that saying about straining at the gnat of the prior distribution while swallowing the camel that is the likelihood.
      
      At a practical level, we can think of a strong prior as a form of regularization, not an “assumption” as much as a device that allows us to fit a bigger, more realistic model of the data. A little bit of prior can allow us to relax a lot of assumption in the data model or likelihood.
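A small sketch of the "prior as regularization" reading, under assumptions chosen for simplicity and not taken from the comment: a line through the origin, normal noise with known standard deviation, and a zero-centred normal prior on the slope. In that setting the posterior mean of the slope equals the ridge-penalised least-squares estimate with penalty noise_sd² / prior_sd². The data and function names are hypothetical.

```python
def ridge_slope(xs, ys, lam):
    """Minimiser of sum((y - b*x)**2) + lam * b**2 for a line through the origin."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def posterior_mean_slope(xs, ys, noise_sd, prior_sd):
    """Posterior mean of b for y = b*x + Normal(0, noise_sd) noise
    under the prior b ~ Normal(0, prior_sd)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    precision = sxx / noise_sd**2 + 1 / prior_sd**2
    return (sxy / noise_sd**2) / precision

xs = [0.5, 1.0, 1.5, 2.0]   # made-up predictor values
ys = [1.2, 1.9, 3.4, 3.9]   # made-up responses
noise_sd, prior_sd = 1.0, 0.5

lam = noise_sd**2 / prior_sd**2          # penalty implied by the prior
b_ols = ridge_slope(xs, ys, 0.0)         # no penalty: ordinary least squares
b_ridge = ridge_slope(xs, ys, lam)
b_bayes = posterior_mean_slope(xs, ys, noise_sd, prior_sd)

print("OLS slope:          ", b_ols)
print("ridge slope:        ", b_ridge)
print("posterior mean slope:", b_bayes)
```

The ridge and posterior-mean slopes coincide and both are pulled toward zero relative to the unpenalised fit. This one-parameter case only shows the mechanics; the practical payoff described above comes in models with many parameters, where such shrinkage is what keeps the fit stable.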
    - Christian Hennig on January 24, 2022 4:19 PM at 4:19 pm said:
      
      Andrew: Well, the likelihood is to be swallowed by both the Bayesian and the frequentist. I see what you’re getting at. I wouldn’t frame it in terms of “relaxing assumptions”, but then I don’t think “assumptions” are really assumed in statistics anyway (in the sense that they need to be fulfilled), see above.
      
      To me it looks like this:
      Likelihood only => lighter assumptions than likelihood + prior,
      …however I agree that…
      More complex supermodel with prior of which simpler model is a special case => lighter assumptions than simpler model only (implicitly giving it prior probability 1)
    - David Marcus on January 25, 2022 9:10 AM at 9:10 am said:
      
      Christian:
      
      I’ll state it more simply. Frequentist analyses manage to avoid specifying an important part of the model (the distribution of the parameter) by answering a different question than the one that was asked.
      
      Of course, you can use any formula you like in any situation. But, until you think about how close your model (the whole model, not just the parts you forgot to mention) is to reality, you won’t be able to decide whether doing so is a good idea.
- Keith O'Rourke on January 22, 2022 8:50 AM at 8:50 am said:
  
  From my 1996 editorial arguing that clinical research needed to step up on these issues.
  
  “In terms of dealing with the arbitrariness of prior specification in RCTs, I agree with the author that it is critical that peer review protocol be developed and disseminated through the RCT community so that benefits can be better obtained when the Bayesian approach is adopted. The merit of the Bayesian approach is in the quality of the prior judgment (probabilities) not in any inherent sensibility. When a posterior is presented, I believe it should be clearly and primarily stressed as being a “function” of the prior probabilities and not the probability “of treatment effects.” It is easy to bash the arbitrariness of conventions/selections of hypotheses and it is easy to bash the arbitrariness of priors. To encourage researchers to deal more wisely with uncertainty flexible thinking and less bashing seems preferable.”
  
  Two cheers for Bayes. O’Rourke K. (1996) Controlled clinical trials, 17 (4) , pp. 350-352
  (Only 4 citations from people who knew me – horribly written or just not something clinical researchers want to deal with?)
  
  - John Park, M.D. on January 22, 2022 5:01 PM at 5:01 pm said:
    
    Keith. I would love to read it, but it’s behind the dreaded paywall… do you have a link to a freely available copy? To directly answer your question there are about 5 physicians who want to deal with this, I am one of them, and I started high school in 1996 when it was published LOL so I doubt it’s because it was horribly written!
    
Deborah Mayo on January 21, 2022 7:15 PM at 7:15 pm said:

The commentary by lawyer Nate Schachtman:
https://errorstatistics.com/2022/01/18/nathan-schactman-of-significance-error-confidence-and-confusion-in-the-law-and-in-statistical-practice-guest-post/
includes (at the bottom) links to all the previous commentaries up to that point, if you’re interested. The topic of my editorial is, as the title says, “The Stat Wars and Intellectual Conflicts of Interest”–and their dangers– more than on statistical significance tests or p-values themselves. One example discussed is the American Statistical Association’s Presidential Task Force on Statistical Significance and Replicability which was appointed in 2019 to combat the confusion stemming from the editorial, Wasserstein et al (2019), regarding the ASA’s position on statistical significance and p-values.
The previous commentaries are by:
Park
Dennis
Stark
Staley
Pawitan
Hennig
Ionides and Ritov
Haig
Lakens

- Andrew on January 22, 2022 8:24 AM at 8:24 am said:
  
  Deborah:
  
  Did you choose these people to discuss your article, or did you post the article and then these were the people who saw your post and chose to write something?
  
BenjaminKay on January 22, 2022 7:31 AM at 7:31 am said:

“My response to this is that all parts of an analysis, including data model, prior distributions, and estimates or assumptions of costs and benefits, should be explicitly justified.”

I’m an economist and I see some Bayesian papers in the course of my work. I can’t say that I’ve ever seen a well justified prior distribution for all the parameters of a model. One or two might be grounded in a meta analysis. But the closest I’ve seen to a justification for all parameter priors is someone saying that the distributions they choose have nice properties or are the same priors someone else chose. Certainly not a serious justification, more like a nod in the direction of justification.

Done thoroughly, doesn’t this take up a huge amount of space? It is common for models with fixed effects and controls to have many parameters. A serious justification of prior distributions (and parameters for those priors) for those parameters seems like a considerable exercise.

- Andrew on January 22, 2022 8:23 AM at 8:23 am said:
  
  Benjamin:
  
  I agree. Justifying a model is typically more of an ideal than something that is actually achieved. In real life, just about all the parts of a model should be considered tentative. I am bothered that skeptics and subjectivists alike swallow the camel that is the likelihood while straining on the gnat of the prior distribution.
  
- David Marcus on January 22, 2022 8:24 AM at 8:24 am said:
  
  You only care about the details of the prior in the region where the likelihood is large. So, often you don’t need to worry too much about the prior or can just use a flat prior. But, this should be a conscious choice, not a choice hidden in a frequentist method that claims to not be using a prior.
  
  - Andrew on January 22, 2022 8:26 AM at 8:26 am said:
    
    David:
    
    Yes, we discuss this and related issues in our article, The prior can often only be understood in the context of the likelihood (which we unfortunately published in a so-called predatory journal, but that’s another story).
    
    - David Marcus on January 22, 2022 12:26 PM at 12:26 pm said:
      
      Andrew: Yes, people should read your papers!
Carlos Ungil on January 23, 2022 12:46 PM at 12:46 pm said:

> most Bayesians interpret probability in an epistemic way, which means that such a true parameter [of the data generating process] doesn’t exist, for which reason the Bayesian calculation doesn’t say anything about where it is.

About what does the Bayesian calculation say something then? Interpreting probability in an epistemic way doesn’t mean that the object of the knowledge doesn’t exist.

(If you mean that the true parameter doesn’t ever exist because a model will never be a correct representation of reality, I don’t think a frequentist interpretation of probability will help much.)

> Believing in the existence of a true parameter implies a frequentist interpretation of probability.

Really? When people do a Bayesian analysis of the mass of a star, for example, they usually consider the existence of a star with some definite -but unknown- mass. The probability is interpreted as epistemic: it represents their knowledge about the [true] mass of the star.

- Carlos Ungil on January 23, 2022 12:50 PM at 12:50 pm said:
  
  In case it’s not clear, that was a reply to Christian Hennig’s comment https://statmodeling.stat.columbia.edu/2022/01/20/more-on-p-values-etc-etc-etc/#comment-2043183
  
Carlos Ungil on January 23, 2022 2:42 PM at 2:42 pm said:

Phil:

> One envelope contains twice as much as the other. I pick one. I don’t open it; I have no idea how much money is inside. Is it not the case that the other envelope is equally likely to have half as much or twice as much?

It depends!

In your first comment you wrote: “You’ve chosen an envelope. It contains some amount of money, which we will call A. The other envelope is equally likely to contain 2A or A/2.”

The response in that case is NO. Conditional on a particular amount in the first envelope, the other envelope is not in general equally likely to have half as much or twice as much (though it may be for some particular values).

Now you write “If I chose the lower one then the other contains twice as much. If I choose the higher then the other contains half as much.”

The response in this case is YES. But when you express it like that it’s not difficult to see that, were the envelopes to contain for example $50 and $100, what you may gain doubling your $50 envelope is exactly the same amount ($50) that you may lose halving your $100 envelope.
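
A minimal Python sketch of both answers. The prior over envelope pairs used here (the pair is either ($50, $100) or ($100, $200), with equal probability) is a hypothetical choice made only for illustration, and the function names are invented for this example.

```python
from collections import defaultdict
from fractions import Fraction


def prob_other_is_double(pair_prior):
    """For each amount A you might hold, return P(other envelope holds 2A | you hold A).

    pair_prior maps (low, high) pairs, with high == 2 * low, to prior
    probabilities. You pick either envelope of the pair with probability 1/2.
    """
    # amount -> [P(hold A and other is 2A), P(hold A)]
    joint = defaultdict(lambda: [Fraction(0), Fraction(0)])
    for (low, high), p in pair_prior.items():
        half = Fraction(p) / 2
        joint[low][0] += half   # holding the low amount: the other is double
        joint[low][1] += half
        joint[high][1] += half  # holding the high amount: the other is half
    return {amount: num / den for amount, (num, den) in joint.items()}


def expected_gain_from_switching(low, high):
    """Expected gain from always switching, for one fixed pair of amounts."""
    half = Fraction(1, 2)
    return half * (high - low) + half * (low - high)


# Hypothetical prior: two possible pairs, equally likely.
prior = {(50, 100): Fraction(1, 2), (100, 200): Fraction(1, 2)}
result = prob_other_is_double(prior)
for amount in sorted(result):
    print(f"holding ${amount}: P(other is double) = {result[amount]}")

print("expected gain from switching with $50/$100:",
      expected_gain_from_switching(50, 100))
```

Under this assumed prior the conditional probability equals 1/2 only when you hold $100, which is the "NO" case, while the unconditional expected gain for a fixed pair is zero, which is the "YES" case.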

Another completely different two envelopes paradox that you may appreciate:

There are two envelopes with some quantity of money that has been established using the same procedure: flip coins until you get tails and put in the envelope 2^(number of heads) dollars. For example, if you get tails on the first flip $1, if you get three times heads and then tails $8, etc.

You pick one envelope. If the envelope contains $x one can show that on average you’ll get more money if you switch. If you should switch whatever the content of the first envelope, you don’t even need to open it to decide that you should switch.

But the envelopes are exchangeable and there is no real reason to switch!
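
A short Python sketch of why switching looks favorable for every x. It assumes a fair coin and that the two envelopes are filled independently (my reading of "the same procedure"); the function names are made up for this illustration. Each outcome contributes exactly 1/2 to the expected amount, so the truncated mean grows without bound.

```python
from fractions import Fraction


def prob_amount(k):
    """P(envelope holds 2**k dollars): k heads followed by one tail, fair coin."""
    return Fraction(1, 2) ** (k + 1)


def truncated_mean(max_heads):
    """Expected amount counting only outcomes with at most max_heads heads."""
    return sum(prob_amount(k) * 2**k for k in range(max_heads + 1))


for n in (9, 99, 999):
    print(f"mean truncated at {n} heads: {truncated_mean(n)}")
```

Because the full expectation diverges, the other envelope's expected content exceeds any finite $x you might find in yours, even though the two envelopes are exchangeable.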

- Phil on January 23, 2022 4:15 PM at 4:15 pm said:
  
  Carlos,
  Maybe it wasn’t clear to you that I understand the problem and its solution very well. I was trying to explain, to people who instantly understand the flaw in the argument for switching, why it is that some people do not instantly see the flaw. But perhaps this is not something that can be explained.
  
Carlos Ungil on January 23, 2022 6:04 PM at 6:04 pm said:

Christian:

> [probabilities can represent the researcher’s beliefs about some “true parameter”], as far as I’m concerned. However, if this is the true parameter of a true probability model, that probability model isn’t epistemic, and is something that de Finetti and many other Bayesians state does not exist.

I’m not sure about how to interpret that in the context of your previous claim that “most Bayesians interpret probability in an epistemic way, which means that such a true parameter doesn’t exist, for which reason the Bayesian calculation doesn’t say anything about where it is. Believing in the existence of a true parameter implies a frequentist interpretation of probability.”

Say, for example, that a Bayesian astrophysicist estimates the orbital parameters of an extrasolar planet orbiting a star from its effect on the star’s spectrum.

Which of the following things do you disagree with?

a) The resulting posterior probabilities are interpreted in an epistemic way

b) The true eccentricity, period, longitude of periastron, etc. of the orbit do exist

c) The Bayesian calculation does represent a level of certainty about the eccentricity

d) The Bayesian calculation does say something about the eccentricity to the astrophysicist

e) The Bayesian calculation does say something about the eccentricity

Why would believing in the existence of a true eccentricity imply a frequentist interpretation of probability?

(I’d like to know what _you_ mean, not what you think that de Finetti, or some other leading Bayesian, or many other Bayesians, would think about it. But, as a bonus question, which of those items do you think that most Bayesians would disagree with?)

- Christian Hennig on January 23, 2022 8:02 PM at 8:02 pm said:
  
  @Carlos: “Why would believing in the existence of a true eccentricity imply a frequentist interpretation of probability?” I repeat myself but what implies a frequentist interpretation of probability is the belief that the true eccentricity is a parameter of a true data generating process modelled by a probability model (this may not be what your astrophysicist believes, and that’s fine by me). Otherwise the Bayesian astrophysicist needs to know what they mean, not me! If I had a chance to see their working and interview them about why they chose what they chose, I may have a chance to answer your questions but certainly not in this way. Personally I’m a pluralist and open to various different interpretations of probability including various flavours of Bayes. The whole discussion started from David claiming that the Bayesians deliver what the practitioners want as opposed to the frequentists, and I was stating that in my view what many practitioners want is a mix up of ideas that neither the frequentist nor the Bayesian can give them. I did not deny at any point that what Bayesians do makes some sense (where it does;-).
  
  - Carlos Ungil on January 24, 2022 7:28 AM at 7:28 am said:
    
    Ok, so if I understand it correctly
    
    a) when a scientist does Bayesian inference – as in having a model p(data|parameters) and a prior distribution p(parameters) which are combined to obtain a posterior distribution p(parameters|data)
    
    b) if there are parameters in the model that correspond to some real-world quantity – as usually happens in science
    
    c) then the analysis doesn’t qualify as Bayesian for you – because by [your] definition a Bayesian analysis cannot say anything about the true values of the parameters
    
    d) and [maybe] you don’t think that the probability p(parameters|data) has an epistemic interpretation – it’s unclear to me if that’s what you say and what would the interpretation of p(parameters|data) be in that case
    
    When Laplace produced an estimate of the mass of Saturn that was not a Bayesian analysis because a Bayesian analysis cannot say anything about the mass of Saturn.
    
    If it’s just a matter of labelling things as Bayesian or not Bayesian these distinctions are mostly inconsequential.
    
    But reading your comments I gather that it’s not just Bayesian calculations that cannot say anything about the true value of the quantities of interest. The conclusion would be that scientists want to say something about the external world but that’s something that no method of inference can give them.
    
    - Christian Hennig on January 24, 2022 8:10 AM at 8:10 am said:
      
      Not sure how much longer we have to go on with this, as I believe that I have made my point before. I can call many things Bayesian and for sure I have no problem calling an analysis Bayesian in which Bayesian computations involving priors occur. I’m not saying that what is frequentist is not Bayesian (frequentist referring to an interpretation of probability, Bayesian referring to a way to do inference based on probabilities – these are not per se incompatible). The trouble comes in because most Bayesians use an epistemic interpretation of probability, which is indeed not frequentist, and doesn’t locate the “modelling target” of a probability model in the real world. This (and the requirement to specify a prior that has epistemic interpretation) is what many practitioners are not happy with, which was my original point (note that I’m *not* saying that I personally am not happy with this – that’d be a different discussion).
      
      I also say once more that believing in the real existence of a certain natural constant a scientist may be interested in is weaker than believing in this constant being a parameter of a probability model which is meant to model data generation in the real world (and as such would be frequentist, not epistemic; though could in principle still be handled with Bayesian computation).
    - Daniel Lakeland on January 24, 2022 1:29 PM at 1:29 pm said:
      
      Christian, I think you’re making a mistake when you say “believing in this constant being a parameter of a probability model which is meant to model data generation in the real world” is a “frequentist concept”. It’s entirely reasonable to say that the “data generating process” in the world is gravitational accelerations etc combined with light propagation and bending in lenses and recording on cameras and etc such that “for all we know” if the true value of the parameter is x then the data we will record is “not far from y(x)” such that p(y* – y(x) | x) is a representation of how likely we believe that error would be a certain size (with y* being the measured value and y(x) the predicted). The “likely” aspect is entirely *in our heads* whereas the x is definitely located in the world, some real eccentricity of an orbit or whatever it is.
    - Christian Hennig on January 24, 2022 3:53 PM at 3:53 pm said:
      
      Daniel: What you describe is epistemic probability assuming that there is a true value of some constant, but the probability distribution around it is epistemic. That’s fine by me. What I meant, however, is that many practitioners I know want to interpret the probability model as modelling the data generating process, not their own state of mind.
      
      The subtlety from my point of view is this: Mathematical models are generally thought constructs and not “really true”. This applies to both frequentist and epistemic models (I’m simplifying here because there are certain non-frequentist interpretations of probability that are not epistemic either, however these are not very well known and not in wide use as far as my experience goes). The difference is that the frequentist models model data generating processes in the world, whereas epistemic models model a state of knowledge. Those who prefer epistemic models often argue that if “objective” probability doesn’t exist in the real world, we are only left with epistemic probability. But I think that this doesn’t really work as an argument because being an idealisation that doesn’t fully capture reality applies just as much to epistemic probability regarding “epistemic reality”.
      
      One problem in this discussion is that I am still with the original issue that was David Marcus’s claim that the frequentist doesn’t give the practitioner what they want, and what they actually want is something Bayesian. I have challenged that, so my point is about the thinking of practitioners, more precisely a certain group of them who tend to interpret a p-value as a probability for the null hypothesis being true. A weakness about both David’s and my claim is that neither of us has hard data about how the majority (?) of practitioners think, as I have to admit. However this was not meant by me as discussing my personal view of Bayesian statistics, and what I said about certain practitioners’ point of view should not be confused with my own.
    - Carlos Ungil on January 24, 2022 4:12 PM at 4:12 pm said:
      
      Christian,
      
      I can understand if you’re tired of repeating yourself. But honestly, despite the repetition I don’t see what “the trouble” is.
      
      If you’re right and most Bayesian practitioners have an issue with having a posterior probability that has an epistemic interpretation relative to a real-world quantity maybe some kind Bayesian soul in the audience can help understand why.
      
      As a toy example – that surely should be enough to discuss such foundational issues – inspired by the envelope switching problem let’s imagine that you let me pick a red or a blue envelope. You have put into one of them a high prize (two $10 notes, one $1 note) and in the other a low prize (two $1 notes, one $10 note). I can pick one envelope, reach inside and take one note out, and either keep the envelope or switch.
      
      I pick the red envelope. You put either $12 or $21 in it. You know what’s inside but I don’t.
      
      I have a model, with a parameter ‘z’ that admits two values: 0 if I got the low prize, 1 if I got the high prize. There is a “true” value that I don’t know.
      
      My prior for this parameter is p(z=0)=p(z=1)=0.5 which has an epistemic interpretation.
      
      My model for the dollar amount of the note I take out is p(x=1|z=0)=p(x=10|z=1)=2/3, p(x=10|z=0)=p(x=1|z=1)=1/3.
      
      If it’s a dollar, it makes it more plausible that I got the envelope containing two dollar notes than the envelope containing only one. I should switch. My posterior is p(z=0|x=1)=2/3, p(z=1|x=1)=1/3.
      
      If it’s ten dollars, my posterior is p(z=0|x=10)=1/3, p(z=1|x=10)=2/3 and I shouldn’t switch.
      
      These posterior probabilities, like the prior probabilities before, have an epistemic interpretation. They are about which envelope – high prize or low prize – I have. This parameter has a true value, unknown to me.
      
      I don’t see the conflict in all that, really.
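      
      A worked version of the arithmetic in this example, as a sketch that uses only the prior p(z=0)=p(z=1)=1/2 and the likelihood values stated above (the LaTeX notation is added here for illustration):
      
      ```latex
      \begin{aligned}
      p(x=1) &= p(x=1 \mid z=0)\,p(z=0) + p(x=1 \mid z=1)\,p(z=1)
              = \tfrac{2}{3}\cdot\tfrac{1}{2} + \tfrac{1}{3}\cdot\tfrac{1}{2} = \tfrac{1}{2}, \\
      p(z=0 \mid x=1) &= \frac{p(x=1 \mid z=0)\,p(z=0)}{p(x=1)}
              = \frac{\tfrac{2}{3}\cdot\tfrac{1}{2}}{\tfrac{1}{2}} = \tfrac{2}{3}, \qquad
      p(z=1 \mid x=1) = 1 - \tfrac{2}{3} = \tfrac{1}{3}, \\
      p(z=1 \mid x=10) &= \frac{p(x=10 \mid z=1)\,p(z=1)}{p(x=10)}
              = \frac{\tfrac{2}{3}\cdot\tfrac{1}{2}}{\tfrac{1}{2}} = \tfrac{2}{3}, \qquad
      p(z=0 \mid x=10) = \tfrac{1}{3}.
      \end{aligned}
      ```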
    - Christian Hennig on January 24, 2022 4:29 PM at 4:29 pm said:
      
      “If you’re right and most Bayesian practitioners have an issue …” I’m *not* talking about “Bayesian practitioners”. I’m talking about the kind of practitioner that uses p-values and is prone to misinterpret them – that kind of practitioner that David claims would really want something Bayesian. As far as I know them (my knowledge is limited but for sure larger than zero), the thing is they don’t want epistemic probability, and they don’t want to specify a prior. One reason why they don’t want that is that they work on projects in which prior knowledge is by far not as clear and straightforward to formalise as in your example.
    - Carlos Ungil on January 24, 2022 4:56 PM at 4:56 pm said:
      
      I’m sorry if I said “Bayesian practitioners” where I should have said “Bayesians”. By the way, it makes me wonder if when you said
      
      > most Bayesians use an epistemic interpretation of probability, which is indeed not frequentist, and doesn’t locate the “modelling target” of a probability model in the real world
      
      and
      
      > most Bayesians interpret probability in an epistemic way, which means that such a true parameter [of the data generating process] doesn’t exist, for which reason the Bayesian calculation doesn’t say anything about where it is.
      
      it does apply to “most Bayesian practitioners” and “most Bayesian theorists” or only to one of those (exhaustive?) groups.
      
      I was obviously not talking about the issue of priors or models but about the quotes above.
      
      As far as I can see, interpreting the probability – of the parameter z – in an epistemic way doesn’t mean that the true parameter – z=0 or z=1 – doesn’t exist, nor that the Bayesian calculation – p(z|x) = p(x|z)p(z)/p(x) – doesn’t say anything about it.
    - Christian Hennig on January 24, 2022 5:12 PM at 5:12 pm said:
      
      Carlos: I thought I addressed this already when replying to Daniel.
      I distinguish the following two things:
      1) There is a parameter in the (epistemic) Bayesian model that refers to something real. That might well be.
      2) The distribution parametrized by said parameter (i.e., the likelihood) refers to a real distribution. That is not epistemic then.
    - Carlos Ungil on January 24, 2022 5:19 PM at 5:19 pm said:
      
      I still don’t know what you mean.
      
      Is p(x|z) – the distribution of the dollar amount I fetch from the envelope – a “real distribution” or not?
      
      Does the posterior p(z|x) have an epistemic interpretation or not?
    - Christian Hennig on January 24, 2022 5:26 PM at 5:26 pm said:
      
      In your example you can portray p(x|z) as a “real distribution” because you have set it up in such a way that you know it, so in fact you can choose your epistemic distribution as the real one. If you use an epistemic concept of probability, all parts of your model should be epistemic. Normally, using an epistemic concept of probability, p(x|z) should be epistemic, too. It models your knowledge/belief about the process, not the process itself. However, if you know the process very precisely, the two can hardly be told apart.
    - Carlos Ungil on January 24, 2022 7:06 PM at 7:06 pm said:
      
      > you have set it up in such a way that you know it, so in fact you can choose your epistemic distribution as the real one.
      
      Ok, so I got a Goldilocks likelihood that “works” when it really shouldn’t… How can I break it and what happens then?
      
      Do you mean that I could write down and “know” a likelihood that is not “the real one”?
      
      Say that in fact the (z=0) $12 envelope has 12 $1 notes and the (z=1) $21 envelope has 21 $1 notes. My likelihood p(x|z) is then wrong, because in fact I can only fetch dollar bills from any of the envelopes. Upon observing x=1 I will calculate a posterior p(z=0|x=1)=2/3 and I will want to switch envelopes. And you will be laughing at me and my stupid model – especially if I’m trading down.
      
      However, I don’t see what changes fundamentally when a wrong model is used.
      
      My posterior probability p(z|x) still has the same epistemic interpretation. There is still the same true value of the parameter – independent of how bad the likelihood is. The Bayesian calculation still says something about the parameter – even though the inference is wrong because the model is wrong.
    - Christian Hennig on January 24, 2022 7:38 PM at 7:38 pm said:
      
      This is the last thing I’ll write today, and probably in this discussion. My point is that epistemic probability and its justification through coherence require that all parts of the model are epistemic, and that includes p(x|z). In fact, I don’t think it would create any problem for your analyses and the final interpretation of the posterior if you accepted that p(x|z) is epistemic as well. What you seem to try is to convince me that p(x|z) is instead a “real” distribution. Of course it is, in your example, because the example is artificial and you constructed it as a real distribution. As such it is not epistemic, but it doesn’t hurt your Bayesian analysis at all if you take for the required epistemic p(x|z) the “real” (in fact made up) one (whether what your Bayesian uses is correct or not is not the issue really; in order to give you an overall coherent epistemic approach, it’s got to be epistemic).
      
      If you talk about “real” probabilities, what do you mean by this (taking into account that de Finetti, Dawid and others hold that such a “real probability” does not exist, and in your example of course it exists by fiat only)? One possibility to give an answer is frequentist, but it seems you don’t want that.
    - Daniel Lakeland on January 24, 2022 7:54 PM at 7:54 pm said:
      
      I’m not sure what is meant by “real” probability distribution. But I am very sure I understand what a real **parameter** is. There is objectively some real actual number of dollars in the envelope, or eccentricity of the orbit, or temperature of the frying pan, or tensile strength of the steel, or number of people who did in fact vote for Biden in such and such a county, or fraction of children that would do better than X score on a test if they took it next week.
      
      When we say p(Observed_Thing | Real_Parameter) we are discussing epistemic knowledge we have about what would be observed when the Real_Parameter which is an actual fact about the real world takes on the given value.
      
      This knowledge we have induces P(Real_Parameter | Observed_Thing, BackgroundKnowledge) which is a **real** probability distribution in that it represents knowledge we have about a real fact about the world conditional on our knowledge we’re using to build the model.
      
      You seem to be using the “real” term to mean **frequency distribution under repeated experimentation** but without actually admitting it, and then saying that Bayes can’t give us “real” distributions because under repeated experimentation the Real_Parameter is just a fixed fact about the world, and hence has no frequency distribution. That seems to me to be just playing with the meaning of words without being helpful.
    - Andrew on January 24, 2022 8:32 PM at 8:32 pm said:
      
      Daniel:
      
      Yeah, I thought about this a lot in grad school when I heard about the hardcore all-is-prediction attitude toward statistics. The way I put it is that “parameters” are what generalize from one problem to the next. That’s one reason I get irritated at computer scientists who’ve talked about this big new idea called “transportability”; it’s just what we would call mathematical or statistical modeling.
    - Carlos Ungil on January 25, 2022 3:10 AM at 3:10 am said:
      
      Christian,
      
      For the record, what I’m trying to do since the beginning is to make sense of your claim that “most Bayesians interpret probability in an epistemic way, which means that such a true parameter doesn’t exist, for which reason the Bayesian calculation doesn’t say anything about where it is.”
      
      > I don’t think it would create any problem for your analyses and the final interpretation of the posterior if you accepted that p(x|z) is epistemic as well.
      
      I’ve neither accepted nor rejected that; what I’m trying to understand is what you mean and how it is relevant. I thought that the epistemic interpretation was the source of the “trouble”, anyway.
      
      > What you seem to try is to convince me that p(x|z) is instead a “real” distribution.
      
      Am I? You’re the one throwing the “real” labels around on models and probability distributions, not me. If anything, I’m saying that the envelopes and their contents and the exoplanets and their orbits are real.
      
      > Of course it is, in your example, because the example is artificial and you constructed it as a real distribution.
      
      That’s why I tried to break it later, to make it less “real” for you, whatever that means, and see if that sheds any light on the aforementioned quote.
      
      > As such it is not epistemic, but it doesn’t hurt your Bayesian analysis at all if you take for the required epistemic p(x|z) the “real” (in fact made up) one (whether what your Bayesian uses is correct or not is not the issue really; in order to give you an overall coherent epistemic approach, it’s got to be epistemic).
      
      See, I’m already lost here. How is it not epistemic? (Based or not on the right envelope contents.) How is p(moon positions|mass of Saturn) not epistemic? How is p(spectrum|orbital parameters) not epistemic?
      
      > If you talk about “real” probabilities,
      
      You talk about “real” probabilities. I just try to follow what you say.
    - Christian Hennig on January 25, 2022 5:28 AM at 5:28 am said:
      
      The key difference between epistemic and frequentist probability is, in my view (which is in line with much that can be found in the literature), what is the target of modelling by a probability model. Epistemic probability models a state of knowledge, frequentist probability models a data generating process in the real world. (Obviously, epistemic probability can model knowledge about such a process, and I admit the term “real” on its own is very ambiguous, as of course the modelled knowledge may well be about something real. I’m also simplifying, as mentioned earlier, because there are “aleatory” interpretations of probability that are not frequentist.)
      
      I don’t personally have a problem with either, and I have written earlier that Bayesian inference is compatible with both of them. However what I believe is when using probability models we should decide what the probabilities mean, and they should be either epistemic or frequentist/aleatory, but not both in the same model, because any justification of the probability axioms is based on either relative frequencies or epistemic coherence, not both. It is fine by me though if a researcher uses epistemic probability for one problem and frequentist probability for another, I’m pluralist in this respect.
      
      Other than that, I’m still not sure whether you confuse the attitude of certain practitioners that I have tried to portray with my own. Some of the objections raised seem to refer to what I have said I believe what these practitioners want (which I can hardly defend, as my original point was that these practitioners want something that neither the Bayesians nor the frequentists can give them), rather than to what I think myself.
    - Christian Hennig on January 25, 2022 5:54 AM at 5:54 am said:
      
      Maybe I should write another posting that explains more clearly my original point from which all this started, as of course the length of this discussion may have to do with me not explaining well…
      
      David Marcus wrote that “what people really want is not what the frequentist is calculating, but what the Bayesian is calculating”, meaning the probability that a certain parameter is in a certain place, rather than a probability for an event assuming a null hypothesis. According to my experience the problem is this: These practitioners (of course without empirical research I can’t know whether they are in fact the same David was referring to) want to interpret probabilities in a frequentist and not epistemic way, and are particularly wary of involving a prior that doesn’t have a frequentist meaning (i.e., can be linked to some observable frequencies) and can, in their view, hardly be justified without strong subjective impact (note that I am not claiming, personally, that there is no subjective impact in a frequentist analysis – it’s there for sure, but better hidden). Yet they’d be happy, as David stated, to have as output a probability distribution over the parameter. They want the omelette without breaking the eggs.
      
      I sometimes read on this blog, “the frequentists had all the time to teach their approach correctly, why is it that people still misinterpret it?” There are of course reasons for this. However, also the Bayesians now had quite some time to convince the practitioners to go Bayesian, and there are reasons, too, why they don’t succeed, or let’s say, at least not as much as they’d like to.
      
      This really shows the appeal of Fisher’s good old fiducial probabilities, but alas, people are not happy with them either (even though they make a bit of a comeback these days). I think that “giving the practitioners what they want” should not be our job, at least not if the people want something that is not to be had.
    - Carlos Ungil on January 25, 2022 6:08 AM at 6:08 am said:
      
      > I’m still not sure whether you confuse the attitude of certain practitioners that I have tried to portray with my own.
      
      Maybe I’ve misunderstood you all along. I thought that the quote in my previous comment represented your own position about the consequences of adopting an epistemic interpretation, as did the following:
      
      > Using an epistemic interpretation of probability (which is Bayesian mainstream as far as I know), probabilities are *not* about any “true parameter”
      
      But now you seem to accept that using an epistemic interpretation of probability, probabilities may be about a “true parameter” in the sense that they may be about the mass of Saturn and the mass of Saturn may be a parameter in the model and Saturn has a true mass.
      
      Or maybe you don’t agree, it’s hard for me to tell. You didn’t answer the questions in my previous comment, I’m not sure if you find the use of the p(body positions|mass of Saturn) model/likelihood problematic somehow.
    - Keith O'Rourke on January 25, 2022 7:22 AM at 7:22 am said:
      
      Carlos:
      
      “the mass of Saturn may be a parameter in the model and Saturn has a true mass”
      Well, taking mass at a given point in time – yes – but the model with Saturn as a parameter is just a fallible representation of Saturn with a parameter that attempts to focus on its mass.
      
      In many/most published Bayesian analyses the assumed models are more often than not a shaky representation of anything in this world. I think that is Christian’s point.
    - David Marcus on January 25, 2022 9:34 AM at 9:34 am said:
      
      I’m probably having as hard a time following this as everyone else, but maybe Christian is pointing out that most users of statistics don’t understand that probability has two uses:
      
      1. Modeling the real world
      
      2. Modeling uncertainty
      
      I will certainly grant that such users will have a hard time explaining what they want, and so ask for something that is impossible. But, the fact that (many) statisticians and most textbooks have been pretending that probability is only used for #1 (or saying that it should only be used for that) is not helping.
      
      The solution is not to continue trying to shoehorn statistics into a framework where we don’t do #2. The solution is for the supposedly knowledgeable professionals to point out that not only is there nothing wrong with #2, but we need it.
      
      If you ask a user what they want, they will almost certainly say they want to know what we would call the posterior of the parameter. We shouldn’t be tricking them by saying, “If you would like this thing [which sounds very much to them like what they asked for], we can skip part of our discussion of the model.”
    - Christian Hennig on January 25, 2022 9:57 AM at 9:57 am said:
      
      Carlos:
      >> Using an epistemic interpretation of probability (which is Bayesian mainstream as far as I know), probabilities are *not* about any “true parameter”
      
      That sentence was probably worded in a somewhat misleading way; I meant “parameter of a probability model” rather than real quantity, see Keith’s comment (and what I had written in the meantime).
    - Carlos Ungil on January 25, 2022 10:19 AM at 10:19 am said:
      
      Keith,
      
      > the model with Saturn as a parameter is just a fallible representation of Saturn with a parameter that attempts to focus on its mass.
      
      What does it mean? The model takes Saturn’s mass and does Newtonian things on it or whatever. Maybe the model is wrong, but why couldn’t the “mass of Saturn” in the model be identified with, you know, the mass of Saturn? How can it be different in a meaningful way, which cannot be solved by redefining it to make the difference go away?
      
      Maybe your point is that nothing is real. The whole discussion about statistical methods is kind of pointless then.
    - Christian Hennig on January 25, 2022 10:44 AM at 10:44 am said:
      
      “I meant “parameter of a probability model” rather than real quantity”
      
      It may be that this distinction is what you’re having trouble with. A parameter of a probability distribution is really only defined within that distribution. Let’s say we’re trying to estimate the mass of Saturn (assuming that this really exists, at a certain point in time at least). Let’s say you expect measurements to scatter around it normally distributed with a parameter mu (for which you specify a prior), and you identify, by interpretation, the mu with the mass of Saturn. Note however that this identification cannot be taken for granted, because unknown to you there may be a systematic bias in the measurements and they may be centered in fact around the mass of Saturn plus some epsilon. This does *not* make invalid your epistemic model, as the model correctly specifies your belief about where observations are to be expected, and it may work correctly predicting future observations, as far as these are subject to the same bias (according to de Finetti and others, models have to be cashed out using predictions of observables; the true mass of Saturn is not directly observable). The mu is defined by your expectations, not by the interpretative link with the mass of Saturn.
      
      This problem is obviously the same for the frequentist who models the observations as being generated by a normal distribution with mean mu; interpreting the mu to be the true mass of Saturn is just as wrong as if the Bayesian does it. What is different is that if the observed distribution turns out to look skew or heavy tailed, the frequentist model will be falsified (as the frequentist models what they think is the “real distribution”), whereas your epistemic model was still a correct model of your expectations.
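
The systematic-bias point above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the "true" mass (568.3, in units of 10^24 kg), the bias, the measurement sd, and the prior are all hypothetical numbers chosen for illustration, and `normal_posterior` is an illustrative helper name. It uses the standard conjugate update for a normal mean with known sd. The posterior for mu concentrates near mass + bias rather than near the mass, yet it still predicts future (equally biased) observations well.

```python
import random
import statistics


def normal_posterior(data, sigma, prior_mean, prior_sd):
    """Conjugate update for the mean mu of a normal model with known sd sigma."""
    prior_prec = 1.0 / prior_sd**2
    data_prec = len(data) / sigma**2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * statistics.fmean(data)) / post_prec
    return post_mean, post_prec ** -0.5


rng = random.Random(0)
TRUE_MASS = 568.3  # hypothetical "true" value, units of 1e24 kg
BIAS = 2.0         # hypothetical systematic error, unknown to the analyst
SIGMA = 5.0        # assumed known measurement sd

# Measurements scatter around TRUE_MASS + BIAS, not around TRUE_MASS.
data = [rng.gauss(TRUE_MASS + BIAS, SIGMA) for _ in range(400)]
post_mean, post_sd = normal_posterior(data, SIGMA, prior_mean=560.0, prior_sd=20.0)

# Future measurements share the same bias, so the model predicts them well.
future = [rng.gauss(TRUE_MASS + BIAS, SIGMA) for _ in range(400)]

print("posterior mean of mu:", round(post_mean, 2), "sd:", round(post_sd, 2))
print("distance from true mass:", round(post_mean - TRUE_MASS, 2))
print("mean of future observations:", round(statistics.fmean(future), 2))
```

In this sketch nothing inside the model flags the gap between mu and the mass; identifying mu with the mass is a separate interpretive step, which holds equally for the frequentist reading of the same normal model.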
      
      Regarding “nothing is real”, by the way, we are used to taking some quantities for real that are not directly observable. Somebody could argue that they are not “really real”, and that’s fair enough as a philosophical point of view, but ultimately to what extent they are real or not cannot be determined by observation, and in the meantime concepts that have clear effect on our perceived reality and can be reliably used (even if only observed with irreducible uncertainty) are real enough for me.
    - Keith O'Rourke on January 25, 2022 12:18 PM at 12:18 pm said:
      
      Agree with Christian but would add labelling it epistemic may be deflating the concern.
      
      I believe a thoughtful scientist would anticipate (should anticipate) their model changing at some point in the future. Now what was the parameter in the old model referring to?
      
      It’s like: this is my prior and data generating model (or just data generating model), so it’s my model-based epistemology, so I don’t need to worry that it’s not true. Like those Bayesians of yesteryear who would argue that it is my prior, so I don’t have to check it.
    - Christian Hennig on January 25, 2022 12:32 PM at 12:32 pm said:
      
      @Keith: Your thoughtful scientist may not be enough of a coherent Bayesian to shrink away from changing their prior in ways other than following Bayes’s Theorem in case the data look too incompatible with their earlier belief. Good for them!
    - David Marcus on January 26, 2022 2:40 PM at 2:40 pm said:
      
      Christian wrote:
      
      > Let’s say we’re trying to estimate the mass of Saturn (assuming that this
      > really exists
      
      I’m sorry, but you lost me here. We need to “assume” that Saturn has a mass
      to do statistics?
Anonymous on January 23, 2022 9:39 PM at 9:39 pm said:

While some doubt that philosophers who’ve never made a statistical inference have much to contribute, I believe they can be deeply profound. Their erudite contributions are as invaluable as they are subtle. Sometimes they’re able to see the whole forest, where mere statisticians just see trees:

https://twitter.com/learnfromerror/status/1484760189555335171

- Mikhail Shubin on January 24, 2022 7:00 PM at 7:00 pm said:
  
  It never ceases to surprise how twitter symbol limit dumbs down every discussion, no matter how clever or stupid it was.
  
  - Andrew on January 24, 2022 8:36 PM at 8:36 pm said:
    
    Mikhail:
    
    I agree with Deborah Mayo when she writes, “Nobody has ever shown why scientists would want to know that their theory has a posterior probability of, say .7, .8. Meaningless.” Cosma Shalizi and I discuss this in our article from 2012.
    
    But lots of other things in that linked twitter discussion are pretty bad. I really don’t like the whole twitter thing, which seems like 10% thought, 90% attitude. And, jeez, yeah, the guy on the thread who wrote, “The problem of priors is usually unsolvable” . . . I hate that kind of thing. Sure—everything’s unsolvable. Go tell it to the people who fit linear and logistic regressions all day!
    
Carlos Ungil on January 25, 2022 12:12 PM at 12:12 pm said:

Christian, [I’m tired of scrolling up to find the “Reply” button, I’ll continue here]

> Let’s say you expect measurements to scatter around it normally distributed with a parameter mu (for which you specify a prior), and you identify, by interpretation, the mu with the mass of Saturn.

My model is not p(observations|mu). My model is p(observations|mass of Saturn). The mass of Saturn is a parameter in the model, for which I specify a prior.

I’m not identifying, by interpretation, the mass of Saturn with the mass of Saturn. The mass of Saturn is the mass of Saturn.

My epistemic model specifies my belief about where observations are to be expected as a function of the mass of Saturn.

- Christian Hennig on January 25, 2022 12:24 PM at 12:24 pm said:
  
  If you don’t see the difference between a parameter in a formal model and something physically real like the mass of Saturn, we won’t come together I’m afraid.
  
  - Daniel Lakeland on January 25, 2022 2:22 PM at 2:22 pm said:
    
    Christian, hmm… there is a difference between “the amount of money in my bank account” and “the dollars the bank gives me when I withdraw all that money”… One is a number stored in a computer memory, the other is a bunch of paper bills I put in a suitcase… But there *is* a 1-1 correspondence between the numerical quantity stored in the computer, and the count of the dollar bills I put in the suitcase.
    
    The map is not the territory, but a good map has a 1-1 correspondence modulo some small error between quantities measured off the map (distances, areas, elevation changes, etc), and quantities that would be measured by a surveyor with survey instruments in the field.
    
  - Carlos Ungil on January 25, 2022 3:53 PM at 3:53 pm said:
    
    You may be right, we may not come together.
    
    If you can’t comprehend an epistemic model involving something physically real like “if the mass of Saturn is X I expect to see Y” or “if this mummy is from the Old Kingdom I expect to see carbon isotope ratios more like this but if it’s from the New Kingdom I expect to see carbon isotope more like that”… that’s your loss, not mine.
    
    - Carlos Ungil on January 25, 2022 4:43 PM at 4:43 pm said:
      
      Or maybe you just object to calling that a “model”?
      
      Maybe we all agree on the substance, which is that a Bayesian
      
      may use a probability distribution p(mass of Saturn) to represent the degree of plausibility assigned to different values of the physically real mass of the planet, and
      
      may use the conditional probability distribution p(observations|mass of Saturn) to represent the expected result of some observations if the mass of Saturn had a particular value, and
      
      may calculate a new probability distribution p(mass of Saturn|observations) that represents the degree of plausibility assigned to different values of the physically real mass of the planet based on the result of the observations and the probability distributions mentioned before.
      
      [Just in case I also avoided saying “likelihood” and “parameter”.]
    - Daniel Lakeland on January 25, 2022 5:22 PM at 5:22 pm said:
      
      Carlos, my impression of the controversy is that he doesn’t consider the p(observations|mass of Saturn) used by the Bayesian to be a “real” probability distribution. That is, it’s not part of a “generative model” of how the real world works, it’s a part of an “epistemic model” of what we *know* about how the real world works, and he thinks that “practitioners” don’t want that.
    - Daniel Lakeland on January 25, 2022 5:26 PM at 5:26 pm said:
      
      I don’t want to put too many words in his mouth, but if that’s the case, I’d just say that it appears the word “real” is sneaking in Frequentism. That is, it’s “real” if the distribution can be measured by repeated observation and then the shape of the histogram compared to the model, and the model is falsified if it doesn’t match sufficiently well… something like that.
      
      The Bayesian answer to this is to make the frequency distribution a thing which itself has parameters which have epistemic uncertainty about them (maybe for example a gaussian mixture model, GMM)… The Bayesian then zeros in on “what we know” about the frequency distribution. The frequency distribution is then “real” to someone like Christian, but the Bayesian probability is still epistemic in that it measures plausibility of the GMM parameters.
    - Christian Hennig on January 25, 2022 6:05 PM at 6:05 pm said:
      
      You seem to think that I use the term “real” to portray the frequentist approach as somehow superior, but I don’t. As I wrote before, I agree with Bayesians such as de Finetti that a really true “objective” probability distribution does not exist. A frequentist model of such a distribution is not “the real thing” but an idealisation, and whether it’s any good is determined by its use – the same holds for an epistemic distribution used by a Bayesian. The difference is the *target* of modelling, but I have tried to explain this already, more than once. Also my discussion was in no way meant to *defend* those practitioners (of which I am well aware that they are only a subset of size unknown to me of all practitioners) who do not want probabilities to be epistemic, yet want a probability for “really true” parameter X to lie in a certain set A without having to specify a prior.
      
      Still, a model is a model, it is a thought construct and has to be connected to reality by interpretation of the modeller, rather than reality entering it directly. This seems so trivial to me (and applies by the way to frequentist as well as epistemic modelling) that I don’t quite get why you seemingly dispute it. So ultimately there really seems to be a big misunderstanding here as I find myself explaining again and again something that seems crystal clear to me, whereas you obviously feel that I don’t get what seems crystal clear to you. Sigh. Maybe we let it sleep for now.
    - Daniel Lakeland on January 25, 2022 7:02 PM at 7:02 pm said:
      
      So far I don’t disagree with you; I just don’t understand what your actual claim is. The part Carlos and I seem to be unable to understand is what it is that the Bayesian model fails to give to these “practitioners” you are speaking for.
      
      As near as I can tell your claim is something like:
      
      1) Practitioners want a “real” p(data | fact about the world) (but you haven’t defined “real” yet you claim that Bayes can’t do this even in principle)
      
      2) Practitioners want an epistemic p(fact about the world | data) but don’t want to specify a prior (which I more or less agree with but is a point that I think practitioners can be educated on)
      
      I think there was a 3rd claim but I can’t remember what it was
    - Christian Hennig on January 25, 2022 7:29 PM at 7:29 pm said:
      
      @Daniel: As far as I can say, these practitioners are not happy with probabilities that model their knowledge rather than what really goes on. Now you can say that the likelihood in a Bayesian (epistemic) model is not worse connected to “what really goes on” than a frequentist model, and I won’t dispute that. What really puts these practitioners off is the requirement to specify a prior, and the insight, connected to it, that their results will depend on how they do that, even though they have no idea how this prior can be related to anything observable. Also, they want a p(fact about the world | data), but they don’t want it epistemic. They want it somehow “objective”.
      
      Now you can say that a prior can be related to observables in various ways if we just sit down with the practitioner and go through what they already know about their subject. Which is fair enough as an existence claim – there exists information that can somehow help to specify a prior. But normally this information will be very far from determining the prior, and the prior will necessarily have elements that are not very well motivated. (For sure in the literature very weakly or not at all motivated priors are endemic.) So it will look rather arbitrary to these practitioners, and they may not want it.
      
      You may say that anyway it will help them, conclusions may be unaffected by what’s unmotivated about the prior if it’s otherwise chosen according to some general standards (?) or rather that it can be chosen so that results are just so sensitive to the data that ultimately it’s not a big deal. Which may well be. Personally I don’t use Bayes much but I’m not against it. These people are more against it than I am, but if you want to spread the use of Bayesian analysis it’s your job to convince them, not mine. And, to come back to my original statement, I don’t think telling them “we give you what you really want” is the right way of doing it. They may say, “thanks a lot but please leave it to us to decide what we really want”, and if they don’t want a prior, you’ll have a hard time selling it in this way. (And although I don’t generally agree with them and I do give it to the Bayesians that you need a prior to get a probability distribution for the parameter, I *can* understand very well why they don’t want a prior, and personally I am keen on avoiding it unless it is clear to me what information that should be brought in can be brought in by using it.)
    - Carlos Ungil on January 25, 2022 7:38 PM at 7:38 pm said:
      
      > Still, a model is a model, it is a thought construct and has to be connected to reality by interpretation of the modeller, rather than reality entering it directly.
      
      Do you consider p(observations|mass of Saturn) a model?
      
      If not, that was the source of the misunderstanding for me. All the discussion about epistemic models, frequentist concepts, true parameters and real probability distributions has been just a big distraction.
      
      > This seems so trivial to me (and applies by the way to frequentist as well as epistemic modelling) that I don’t quite get why you seemingly dispute it.
      
      Maybe I just need to change my language in this context when referring to that kind of ‘pseudo-model’.
      
      Still, if we agree that a Bayesian can have an epistemic probability distribution for a physically real thing [ p(mass of Saturn) ] the only way to do Bayesian things on it with data seems to be using a ‘pseudo-likelihood’ [ p(data|mass of Saturn) ] that conditions on that physically-real-thing-representing ‘pseudo-parameter’ [ mass of Saturn ].
    - Anoneuoid on January 25, 2022 7:51 PM at 7:51 pm said:
      
      Has any moon/mars mission ever weighed something on earth with a scale then compared the result to one using the same scale on the moon/planet?
    - Christian Hennig on January 25, 2022 8:36 PM at 8:36 pm said:
      
      > Do you consider p(observations|mass of Saturn) a model?
      
      As it stands it’s a strange mish-mash of words and formal symbols (I suspect that I may consider a model what you really mean, but you haven’t presented this in a way that I can be sure).
    - Daniel Lakeland on January 25, 2022 10:35 PM at 10:35 pm said:
      
      Christian, I think most practitioners wouldn’t actually complain if I told them to use a prior “uniform on the range of 64 bit IEEE floats” in most cases.
      
      Of course I’d never tell them to do that, I believe in using real information, but notice that they’re already assuming this prior, because they’re using float64 for their calculations. The assumption I’m asking them to swallow is no more than the one they’ve already swallowed when they use float64s
    - Keith O'Rourke on January 26, 2022 7:31 AM at 7:31 am said:
      
      Again getting too long and about to end.
      
      Daniel: “we *know* about how the real world works, and he thinks that “practitioners” don’t want that.” – That is the best we humans can get, but it is not (exactly) the probability of something real. It may or may not be close, but you don’t want to encourage the belief that it is.
      
      This is subtle, and some are now arguing that it is just a hard-to-avoid hazard of using representations (e.g. language): the representation gets taken for what is represented. An animal example would be an animal taking its reflection in a mirror (a representation) for a real animal.
    - Carlos Ungil on January 26, 2022 6:18 PM at 6:18 pm said:
      
      This thread is probably – in an epistemic sense or in a frequentist sense? – dead by now, but addressing the strangeness of that p(observations|mass of Saturn) mish-mash of words and formal symbols looks like a fun pastime. Even if I don’t expect anyone to read it.
      
      The p( | ) pattern of symbols denotes a “probability distribution”, more precisely a “conditional probability distribution” because of the | inside.
      
      p(A) is a function that assigns to a proposition A a “probability” satisfying a few axioms. Here the probability is a measure of the degree of plausibility that we assign to the statement. For example, we can consider propositions relative to the mass of an orange in front of us.
      
      We assign to p(“the mass of the orange is less than 10g”) the value 0, because it’s completely implausible. We assign to the complement p(“the mass of the orange is at least 10g”) the value 1, because we have no doubt about that. We can consider propositions like “the mass of the orange is more than 300g but not more than 400g” and the probabilities for non-overlapping intervals that cover all the plausible values will sum to 1. We can make the intervals smaller “between 300g and 301g”, etc. and in the limit we get an infinite number of propositions like “the mass of the orange is 300g”. We can represent that continuous probability distribution over propositions “the mass of the orange is …” as p(mass of orange). The variable is here a positive number: the mass of the orange in grams. The integral of p(mass of orange) is 1.
      
      We can consider another distribution representing the price that is plausible for the orange. p(price of orange) using the notation above. And we can consider the joint distribution with propositions like “the mass is between 300g and 301g and the price is between 40c and 41c” or as a bi-variate continuous distribution p(mass, price).
      
      The conditional probability p(mass|price) is convenient notation for the plausibility of different values of the mass when the price is fixed at a particular value. Knowing that the orange is expensive may make higher masses more plausible and vice versa. The notation p(mass) is reserved for the marginal probability that just ignores the prices – it is the same p(mass) distribution considered above.
      
      p(observations|mass of Saturn) is a probability distribution that represents how plausible we would find each potential observation – more about that below – if we knew that the mass of Saturn had some fixed value. By combining it with p(mass of Saturn), the plausibility we assign to different masses, we can obtain the joint distribution p(observations, mass of Saturn).
      
      To be clear p(observations|mass of Saturn) is a probability distribution for the observations only – the mass of Saturn is a fixed “parameter” and for each value of the “parameter” we have a different probability distribution for the observations. p(mass of Saturn) [= prior(mass of Saturn)] is a probability distribution representing the plausibility we assign to propositions like “the mass of Saturn is …”.
      
      Now, we can take the ACTUAL value of the observation and plug it in the joint distribution. p(observations=ACTUAL, mass of Saturn) is now a function of the mass of Saturn only. Normalizing we have p(mass of Saturn|observations=ACTUAL) [= posterior(mass of Saturn)] which is a probability distribution representing the plausibility we assign now to propositions like “the mass of Saturn is …”.
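
The steps just described (prior, conditional distribution for the observations, joint with the actual observation plugged in, then normalization) can be sketched numerically on a grid. This is a minimal illustration, not a real analysis of Saturn: the normal prior, the normal observation model, the value of `ACTUAL`, and all numbers (in units of 10^24 kg) are assumptions made up for the example.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


# Grid of candidate values for the mass (hypothetical units of 1e24 kg).
STEP = 0.5
grid = [500.0 + STEP * i for i in range(301)]  # 500.0, 500.5, ..., 650.0

# p(mass): plausibility assigned to each candidate mass before the data.
prior = [normal_pdf(m, 560.0, 30.0) for m in grid]


def likelihood(observation, mass):
    """p(observation | mass): assumed normal scatter of sd 10 around the mass."""
    return normal_pdf(observation, mass, 10.0)


ACTUAL = 570.0  # hypothetical observed value

# p(observations=ACTUAL, mass): now a function of the mass only.
joint = [likelihood(ACTUAL, m) * p for m, p in zip(grid, prior)]

# Normalizing gives p(mass | observations=ACTUAL).
normalizer = sum(joint) * STEP
posterior = [j / normalizer for j in joint]

post_mean = sum(m * p for m, p in zip(grid, posterior)) * STEP
print("posterior mean of the mass:", round(post_mean, 2))
```

With these assumed numbers the grid result can be checked against the closed-form normal-normal update, whose posterior mean is (560/900 + 570/100) / (1/900 + 1/100) = 569.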
      
      If we are worried about the mass of Saturn not being constant through the procedure – and thus not being a physically real thing – that concern can be solved, for example, having the prior and posterior p(mass of Saturn) represent “the mass of Saturn at origin was …” with some arbitrary origin and use, as a bridge into p(observations|mass of Saturn at relevant time), a conditional distribution p(mass of Saturn at relevant time|mass of Saturn at origin). Of course, it will be a delta function for all practical purposes.
      
      About the observations and their p(observations|mass of Saturn) probability distribution conditional on the mass of Saturn: Kepler’s third law gives a relation between the mass of a planet and the period and radius of the orbits of its satellites – if we can ignore their mass. The ratio period^2/radius^3 is constant for a given mass of the planet. The probability distribution could then be the range of values that we expect to see for that ratio – for a fixed value of the mass of Saturn – or it could be a joint probability distribution for a more complicated function of the orbital parameters, or relative positions over time… It could be anything that we want to consider “observations” really.
      
      And it would take into account – either explicitly in the calculation or with a reasonable fudge factor – our uncertainty about the gravitational constant, the resolution of our telescope and other instruments, the accuracy of the approximations, the potential effects from the satellite’s mass, or other objects in the Saturn system, or in the Solar system, relativistic corrections, etc. We could also have additional “parameters” with their own probability distributions, in addition to the mass of Saturn. That doesn’t change anything fundamentally anyway – we can always recover the marginal p(observations|mass of Saturn).
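
The Kepler relation mentioned above can be made concrete. In the Newtonian form of the third law, with the satellite's mass ignored, period^2/radius^3 = 4π²/(G·M), so a fixed mass M predicts a fixed ratio. The sketch below is illustrative only: the function names are hypothetical, and the orbital values for Titan and the value of G are approximate figures supplied here, not taken from the discussion. It ignores every source of uncertainty listed above.

```python
import math

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2 (approximate)


def kepler_ratio(mass_kg):
    """period^2 / radius^3 predicted for satellites of a body of the given mass."""
    return 4 * math.pi**2 / (G * mass_kg)


def central_mass(radius_m, period_s):
    """Mass implied by one satellite's orbit, ignoring the satellite's own mass."""
    return 4 * math.pi**2 * radius_m**3 / (G * period_s**2)


# Approximate orbital values for Titan, used purely for illustration.
TITAN_RADIUS_M = 1.22187e9       # semi-major axis in metres
TITAN_PERIOD_S = 15.945 * 86400  # orbital period in seconds

mass = central_mass(TITAN_RADIUS_M, TITAN_PERIOD_S)
print("implied mass of Saturn in kg:", f"{mass:.3e}")
print("predicted period^2/radius^3:", f"{kepler_ratio(mass):.3e}")
```

A p(observations|mass of Saturn) in the sense of the text would wrap `kepler_ratio` in a distribution, with a spread reflecting instrument resolution, uncertainty in G, and the approximations involved.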
      
      The point is that we have a p(observations|mass of Saturn) that we are going to use in the Bayesian calculation. Can it be called a “model”?
    - Christian Hennig on January 26, 2022 7:27 PM at 7:27 pm said:
      
      @Carlos: Now this was a nice one. I’d call it a model (now that you have explained how all its elements are connected to reality). However I’d say that you could as well write p(a|b) and *interpret* a as the observations and b as the mass of Saturn. That’s not part of the mathematical model itself but rather how you connect it to reality. It’s only a model by means of having such a connection to reality, however there isn’t really the real mass of Saturn in there, just some letters that you interpret in this way, and this doesn’t change if you write “mass of Saturn” rather than b.
    - Carlos Ungil on January 27, 2022 3:58 PM at 3:58 pm said:
      
      Christian, thanks for taking the time to read what I wrote and reply.
      
      I think that what people often want is p(X) – where X is a set of alternative facts about the real world. And they want to infer it from the data observed – which come from the real world, where we get one of the potential outcomes D.
      
      Bayesian inference can give people that – but not if they don’t want p(X) to be epistemic (what else could it be, if X is something real that you don’t know precisely?) and reject it because it’s not “objective” (what would that mean?).
      
      Starting with some p(X), some model p(D|X) and some observed data we end up with something about the real world. The use that we make of p(D|X), inserting the observed value of D on one side and the epistemic probability P(X) on the other, is what connects it to reality. It’s not about how we interpret the names of the parameters in the likelihood, it’s about what we do with them. Of course the model has to be appropriate to what we are trying to model. We agree using any random p(a|b) like that doesn’t make it a relevant, meaningful or useful model.
      
      Anyway, if a practitioner wants p(real-world fact), rejecting Bayesian methods because there’s too much uncertainty in them and using frequentist methods instead is a bit like asking “what time is it?” and answering “yesterday when I went to the church it was noon!”.
    - David Marcus on January 27, 2022 5:09 PM at 5:09 pm said:
      
      Christian wrote:
      
      > Personally I don’t use Bayes much but I’m not against it. These people
      > are more against it than I am,
      
      You shouldn’t let the patients do the surgery. If statistics was easy, the Greeks would have figured it out.
    - David Marcus on January 27, 2022 5:10 PM at 5:10 pm said:
      
      I of course meant the ancient Greeks.
    - David Marcus on January 28, 2022 11:17 AM at 11:17 am said:
      
      Christian: A practitioner comes to you and shows you their data. It is a sequence of 0’s and 1’s that is the result of some experiment. They ask you to help them analyze it. What do you ask them, and what do you do?
    - Christian Hennig on January 28, 2022 12:10 PM at 12:10 pm said:
      
      @David: You are not keen on letting this go, are you? ;-) Obviously I ask them to explain their experiment and what they want to know, everything else follows from there.
    - David Marcus on January 29, 2022 9:09 AM at 9:09 am said:
      
      Christian: Do you ask them why they stopped collecting data? If not, what analyses do you do? What if they say it was an ESP experiment, a lady tasting tea experiment, or a music expert identifying composers from bits of music experiment? Do you do the same analysis in each case? If not, why not?
    - Christian Hennig on January 29, 2022 10:21 AM at 10:21 am said:
      
      @David: For sure whatever I do depends on what they tell me, so no, of course I’m not always doing the same analysis. (Who does? It’s a rather silly question.) Also I don’t work for everyone, not sure whether I’d be up for tea tasting or ESP (composer identification quite certainly yes, I’ve done quite a bit of classification – how this can be done using simple 0-1 data only is beyond me though). I have occasionally told people that it would have been a better idea to consult a statistician before data collection and that at this point there’s nothing that can be done that I could convince myself of making sense. Obviously they’re free to go elsewhere.
      
      If you read “Beyond subjective and objective in statistics” https://rss.onlinelibrary.wiley.com/doi/10.1111/rssa.12276 you’ll see clearly that I believe that good statistical analysis is dependent on all kinds of background information including the aim of analysis.
      
      If you go on asking, you can increase chances of getting a response by giving some context to your questions like you could tell me what you want to achieve with them, and how they’re connected to the topic or things I have written earlier. I’m up for an interesting discussion but not so much an exam style interview.
    - David Marcus on January 29, 2022 7:40 PM at 7:40 pm said:
      
      Christian: So you do ask them for prior information. You just don’t call it that.
    - Carlos Ungil on January 30, 2022 7:16 AM at 7:16 am said:
      
      Christian,
      
      following our exchange I’m not sure where you stand regarding the things that you wrote before.
      
      Can this quote
      
      > I also find dubious the claim that Bayesians can give people properly what they want, because I believe people want to know what the true parameter of the data generating process is (or if you can’t know it precisely, a probability for that), but most Bayesians interpret probability in an epistemic way, which means that such a true parameter doesn’t exist, for which reason the Bayesian calculation doesn’t say anything about where it is. Believing in the existence of a true parameter implies a frequentist interpretation of probability.
      
      be interpreted as
      
      > I also find dubious the claim that Bayesians can give astrophysicists looking at the mass of Saturn properly what they want, because I believe they want to know what the true weight of Saturn is (or if you can’t know it precisely, a probability for that), but most Bayesians interpret probability in an epistemic way, which means that such a true weight of Saturn doesn’t exist, for which reason the Bayesian calculation doesn’t say anything about where it is. Believing in the existence of a true weight of Saturn implies a frequentist interpretation of probability.
      
      in the particular case where
      
      parameter of the data generating process = mass of Saturn, parameter of the model p(observations|mass of Saturn)
      
      or not? Is that a fair view of your position?
    - Christian Hennig on January 30, 2022 7:52 AM at 7:52 am said:
      
      @Carlos: My claim concerns a number of people who are prone to misinterpret p-values in a certain way. I have only once collaborated with an astrophysicist, and there were no p-values involved. So I have no idea. I don’t know enough about how astrophysicists think. (I made it clear enough before that the weakness of my whole claim is that I have no proper systematic empirical knowledge about how many people and which subject areas it applies to.)
      
      So really, you don’t need to believe my maybe overgeneralised claims about certain people – go out and convince them to become Bayesian if you can!
      
      @David: I hope you see that I responded to you further down, as I clicked “reply” in the wrong place.
    - Carlos Ungil on January 30, 2022 2:55 PM at 2:55 pm said:
      
      > My claim concerns a number of people who are prone to misinterpret p-values in a certain way. […] So really, you don’t need to believe my maybe overgeneralised claims about certain people – go out and convince them to become Bayesian if you can!
      
      My humble contribution to the Bayesian cause is trying to correct what seem to be wrong / misleading / overgeneralised claims regarding what can or cannot be done with Bayesian methods and an epistemic interpretation of probability.
      
      It was not clear to me that you were ascribing the claim “Believing in the existence of a true parameter implies a frequentist interpretation of probability.” to the people who misinterpret p-values in a certain way – if that was indeed the intention.
      
      In any case, if your point was that some people will reject the applicability of Bayesian methods for scientific inference because they go outside of the frequentist paradigm you have made it clear.
    - Christian Hennig on January 30, 2022 6:30 PM at 6:30 pm said:
      
      Carlos: This is tiring. If you try to make my statement a statement about an astrophysicist, I’ve got to say what I just said. The “Believing in the existence of a true parameter implies a frequentist interpretation of probability” was discussed earlier, and for long. Groundhog day again. Obviously you can believe in the existence of the true mass of Saturn without frequentism (or rather more generally an aleatory interpretation of reality), however if you think that it’s the true parameter of a model that models a true data generating process (rather than someone’s knowledge about the true quantity in question), that’s pretty much frequentism in a nutshell.
    - Christian Hennig on January 30, 2022 6:39 PM at 6:39 pm said:
      
      By the way, the semantic issue here seems to be the use of the word “parameter” rather than “model”. I hadn’t called the true mass of Saturn “parameter” as a parameter in my view is defined within a model, by means of the model. I get that you talk about a Bayesian model in which you interpret the parameter as the true mass of Saturn, which is fine by me. So I grant you that calling the true mass of Saturn a parameter is incompatible with my statement.
    - Carlos Ungil on January 31, 2022 3:15 AM at 3:15 am said:
      
      > By the way, the semantic issue here seems to be the use of the word “parameter” rather than “model”. I hadn’t called the true mass of Saturn “parameter” as a parameter in my view is defined within a model, by means of the model.
      
      >>> The point is that we have a p(observations|mass of Saturn) that we are going to use in the Bayesian calculation. Can it be called a “model”?
      
      >> I’d call it model (now that you have explained how all its elements are connected to reality).
      
      A general definition of statistical model is a family of probability distributions for some variables – here ‘observations’ – with members of the family corresponding to different values of other variables – here ‘mass of Saturn’ – called parameters.
      
      Now, if that’s not what you’d call a parameter that’s fine. I’ve been trying to understand how restrictive your definitions were.
      
      If I understand correctly you say that what people -in your experience- want is to know the “true parameter of a model that models a true data generating process” and whatever that means exactly you don’t think that Bayesian reasoning can get it, by definition.
      
      Given that you say that frequentist methods cannot get it either, that seems hardly a reason to reject Bayesian methods. If the problem is just that one can never find a true parameter for a true model maybe the researchers should change the object of their desire.
      
      Arguably, it may be the case that the closest thing to “what (you say) people want” that they may actually get is in fact a Bayesian solution. They don’t seem very satisfied with the frequentist solution, anyway, if they often mistake it for an epistemic result.
      
      I’m not sure either where the perceived conflict lies, really. One can have epistemic probabilities about limiting frequencies or whatever.
    - Christian Hennig on January 31, 2022 11:54 AM at 11:54 am said:
      
      Carlos: I’m not rejecting Bayesian methods, I wrote this before. Bayesian methods can be combined with a frequentist interpretation of probability (this makes the motivation of the prior very hard in many cases, but in some others it makes some good sense to interpret it in a frequentist manner), and also I’m not against epistemic probability either.
      
      Aleatory (frequentist) probability models the real process whereas epistemic probability models our knowledge about it. Many scientists (although I don’t know how many) seem to be more comfortable with the former than with the latter. You are right that frequentist models are not automatically “closer to reality” in this way, ultimately they are models (as the Bayesian ones) and models are a tool for thinking. The job of models is not to be true, and I’m with you thinking that there are a number of not so good reasons to favour frequentist probabilities (particularly taking a frequentist model to be more objective and “real” than it is). To say “the researchers should change the object of their desire” is fair enough, and in fact one thing I do often when doing advisory or collaborating on applications is to try to get people to have more realistic expectations of what can be known (as a result of the data analysis) and what cannot be known.
      
      I do however think that modelling the real process is more straightforward as a job than modelling epistemic uncertainty about it, which of course is expressed in the necessity to specify a prior for epistemic uncertainty which you don’t have to do for a frequentist model. (I have written earlier why I don’t agree with the idea that the frequentist “assumes” a prior even without specifying it.)
      
      Consequently, probably the most important reason for many to not adopt a Bayesian approach is that they don’t want to specify a prior in situations in which it is not very clear how to translate existing information into it, or for the more general reason that even if that were possible, they want to separate the “message from the data” from their prior belief. I think people don’t feel that they stand on very solid ground with a certain prior choice (unless it can be well connected to existing observations; that’s a connection to the epistemic probability issue – they’d want to see something like a frequentist process generating parameters in order to feel good about a prior modelling it). But of course they won’t get an epistemic probability for any parameter to be in a certain set without specifying an epistemic prior.
    - Carlos Ungil on January 31, 2022 12:31 PM at 12:31 pm said:
      
      > I’m not rejecting Bayesian methods, I wrote this before.
      
      I was referring to the people who do.
      
      > I do however think that modelling the real process is more straightforward as a job than modelling epistemic uncertainty about it, which of course is expressed in the necessity to specify a prior for epistemic uncertainty which you don’t have to do for a frequentist model.
      
      As far as I can see, the model p(observation|mass of Saturn) doesn’t model our uncertainty about the mass of Saturn. This is, to the best of our knowledge or at least with the expectation of an acceptable level of accuracy, a model of the real physical processes that will generate data for any potential value of the parameter.
      
      Now, when we put it together with a prior that quantifies your initial epistemic uncertainty we get some resulting epistemic uncertainty.
      
      When the aforementioned people rejecting Bayesian methods insist on using the model alone, what do they get? What can they say about the weight of Saturn?
      
      Nothing, unless they make additional assumptions – not so straightforward anymore. The mathematical model has to be at some point connected to the real-world data and the real-world thing they care about.
      
      > But of course they won’t get an epistemic probability for any parameter to be in a certain set without specifying an epistemic prior.
      
      Oh, boy, will they get it! They will just continue to misinterpret frequentist results as saying something about the concrete situation at hand – if that is what they want.
    - David Marcus on February 1, 2022 7:11 AM at 7:11 am said:
      
      Carlos wrote:
      
      > Oh, boy, will they get it! They will just continue to misinterpret
      > frequentist results as saying something about the concrete situation at
      > hand – if that is what they want.
      
      Yes.
  - Christian Hennig on January 29, 2022 8:16 PM at 8:16 pm said:
    
    I have no problems calling it prior information. I have never said that this should be ignored. (Whether and how it can or should be translated into a prior distribution is another matter.)
    
    - David Marcus on January 30, 2022 9:58 AM at 9:58 am said:
      
      So, you accept that prior information is required to do a statistical analysis, but don’t think this should be modeled using probability. Maybe that isn’t what you mean. Hard for me to tell. This is all just too vague for me.
      
      I don’t understand why some people cling to their refusal to model uncertainty using probability while accepting all sorts of peculiar things: P-values that depend on the stopping rule and include the probability of things not observed; confidence intervals whose probability is not the probability of the particular interval, but the probability of something happening in some imagined repetition of experiments (that won’t happen); every problem treated in an ad hoc way because there is no general approach.
      
      I learned frequentist statistics first. Then I learned Bayesian statistics and discovered that what I had learned before was wrong. When you make a mistake, just admit it, and move on.
    - Andrew on January 30, 2022 10:10 AM at 10:10 am said:
      
      David:
      
      I like Bayesian statistics too but I think you’re misunderstanding the non-Bayesian position expressed by Christian.
      
      Let me first say that, just as there is a variety of Bayesians (including the subjectivist cranks and the objectivist cranks, both of whom I disagree with), there is a variety of non-Bayesians.
      
      One non-Bayesian take, with which I disagree strongly, is the view that Bayesian inference lacks the rigor of classical hypothesis testing. I think that position is untenable for the simple reason that classical methods do not have the rigor that is claimed for them: purportedly unbiased estimates are not, in practice, unbiased; real-world confidence intervals don’t have their advertised coverage; etc. The anti-Bayesian hardliners who think of Bayesians as being non-rigorous are, to me, comparable to the Bayesian hardliners who think of non-Bayesians as being incoherent, while not recognizing the incoherences of real-world Bayesian inference.
      
      A different non-Bayesian take, which I think is closer to Christian’s, is that for scientific reasons it is helpful to separately assess the evidence from different sources. Hence, you can evaluate an experiment without using prior information, and then use that prior information later in decision analysis or meta-analysis or whatever. I take Christian as objecting to the position that one should always use Bayesian analyses or always interpret inferences as posterior probabilities. Just as we should not always interpret real-life 95% confidence intervals as having real-life 95% coverage, similarly we should not always interpret real-life posterior distributions as representing probabilities on which we would bet.
    - Christian Hennig on January 30, 2022 11:35 AM at 11:35 am said:
      
      @David: Do you read my postings properly? I have explicitly written that I’m *not* against Bayesian analysis, and that I’m actually a pluralist. I have just written that I don’t use it often. However I’m fine with a number of uses of it and I have seen well motivated priors that clearly have helped the analysis (including on this blog). In most situations in which I have done advisory, however, prior information didn’t seem very suitable for translating into a prior, and/or it wasn’t clear how doing it would improve the analysis. I also met skepticism from clients to do something that depends on a prior that they found hard to justify, which is part of the experience behind my claims in this thread. I will however admit that an advisor like me who doesn’t do Bayes routinely with determination and conviction is maybe not the best to talk the clients out of such an idea.
      
      Note also that there are many other uses of prior information than turning them into a prior distribution. Many of these are *not* about belief, but rather about potential use and interpretation of the analysis, for example regarding the nature of possible outliers, which has impact on whether it is appropriate to use a method that explicitly or implicitly removes them/gives them weight zero. Regarding binary sequences it is a major issue whether there are reasons to suspect observations to be dependent; very often there is such information, very rarely this information comes in a way that allows it to be neatly translated into a prior. Of course Bayesians will try harder than me to “translate” prior information into a prior distribution, and chances are that occasionally they manage to do so successfully in cases in which I wouldn’t have tried, however Bayesian analyses with priors that are not or very weakly motivated are endemic in the literature. This is a major point in Andrew’s and my paper on objectivity and subjectivity as linked earlier. The idea that prior information *only* comes in designing the prior is wrong, there are many other paths for it to enter. This means (a) that frequentists shouldn’t be accused of generally ignoring such prior information, but (b) also that they can’t claim to be more “objective” as there are many decisions in a frequentist analysis that require a “subjective” assessment of the situation.
    - Christian Hennig on January 30, 2022 11:42 AM at 11:42 am said:
      
      By the way I believe I have a rather good record reducing the number of tests and p-values my clients use, sometimes to zero. Be careful shoehorning people!
    - David Marcus on January 30, 2022 11:44 AM at 11:44 am said:
      
      Andrew,
      
      You may be right. But I can’t tell what Christian is saying, and it appears Carlos can’t either.
      
      Regarding your last paragraph, I don’t understand why a Bayesian can’t separately assess the evidence from different sources. I’m also not sure what you mean by “you can evaluate an experiment without using prior information”. Can you give an example of how you would do that?
      
      To me, Bayesian statistics just means: 1) Model uncertainty using probability. 2) Condition on the data. What part of doing this means you can’t “separately assess the evidence from different sources”?
      
      I’m not sure how to state what frequentist statistics is. It seems to be: Only use probability for physical processes.
    - David Marcus on January 30, 2022 1:09 PM at 1:09 pm said:
      
      Christian:
      
      > I’m actually a pluralist.
      
      My point is that the correct way to think about a statistical problem is to think about the uncertainty using probability and to condition on the data.
      
      That doesn’t mean you have to write down a prior. And, that certainly doesn’t mean you should do a “Bayesian” calculation using some prior you pulled out of thin air (or a software package or a textbook) without thinking about it. The model should be a good approximation to reality, where “model” means both likelihood and prior. If you can justify your calculation in such a framework, then that’s fine. My problem is with the people who say something like, “My frequentist analysis didn’t use a prior, so my calculation doesn’t depend on that stuff.”
    - Christian Hennig on January 30, 2022 1:20 PM at 1:20 pm said:
      
      @David:
      > But I can’t tell what Christian is saying, and it appears Carlos can’t either.
      My personal suspicion is that this is because you read more into my postings than was actually in them. Including trying to convince *me* why some people who don’t like epistemic probability & priors are wrong, whereas my point was only that they exist but not that they’re right. Or being surprised that I actually think prior information should be used!?
      
      > My point is that the correct way to think about a statistical problem is to think about the uncertainty using probability and to condition on the data.
      
      I’d be much more careful to talk about *the* correct way. I have convinced clients in certain situations that really all they should do is to use descriptive statistics and visualisation that they can properly understand themselves, rather than probability modeling, be it Bayesian or frequentist.
      
      > “My frequentist analysis didn’t use a prior, so my calculation doesn’t depend on that stuff.”
      
      Let’s agree to disagree on that one. (I have written earlier in this thread on this.)
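
The stopping-rule point raised in this thread (p-values that "depend on the stopping rule and include the probability of things not observed") can be made concrete with a small sketch. The numbers are an illustrative assumption, not data from the discussion: a 0/1 sequence with 9 ones and 3 zeros, tested one-sidedly against a success probability of 0.5. The function names are hypothetical. The same sequence is analysed under two designs: the total number of trials fixed at 12, or sampling continued until the 3rd zero.

```python
from math import comb


def fixed_n_p_value(successes, failures, p0=0.5):
    """One-sided p-value when the total number of trials was fixed in advance."""
    n = successes + failures
    # P(at least `successes` successes in n trials) under H0: p = p0
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k)
               for k in range(successes, n + 1))


def stop_at_failures_p_value(successes, failures, p0=0.5):
    """One-sided p-value when sampling stopped at the `failures`-th failure."""
    # "At least `successes` successes before the stopping failure" is the same
    # event as "at most failures - 1 failures in the first
    # successes + failures - 1 trials".
    m = successes + failures - 1
    return sum(comb(m, j) * (1 - p0) ** j * p0 ** (m - j)
               for j in range(failures))


# Illustrative data (an assumption, not from the thread): 9 ones, 3 zeros.
p_fixed = fixed_n_p_value(9, 3)
p_stop = stop_at_failures_p_value(9, 3)
print(f"fixed n = 12:        p = {p_fixed:.4f}")
print(f"stop at 3rd failure: p = {p_stop:.4f}")
```

Under these assumptions the two p-values come out at about 0.073 and 0.033, on opposite sides of the conventional 0.05 line, even though the observed sequence is identical. The likelihood is proportional to p^9 (1 - p)^3 under both designs, which is why an analysis that conditions only on the observed data gives the same answer either way.
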
Mikhail Шubin on January 26, 2022 5:50 AM at 5:50 am said:

“uniform on the range of 64 bit IEEE floats”

this would be about ~exp(2), right?

- Daniel Lakeland on January 26, 2022 11:43 PM at 11:43 pm said:
  
  according to Julia:
  
  julia> floatmax(Float64)
  1.7976931348623157e308
  julia> log(floatmax(Float64))
  709.782712893384
  
  - Mikhail Shubin on January 28, 2022 12:44 PM at 12:44 pm said:
    
    I mean, if we take a random set of 62 bits and declare it to be a Float, we will get an approximately exponentially distributed value (disregarding the sign), right?
    
    If we declare random 62 bits to be Int62, the resulting value would be uniformly distributed on [-minInt, maxInt], but floats are more complicated.
    
    - Daniel Lakeland on January 28, 2022 2:11 PM at 2:11 pm said:
      
      “uniformly distributed on **the range of the floats**” is different from “uniformly distributed on valid 64 bit floating point numbers”
    - Mikhail Shubin on January 29, 2022 9:21 AM at 9:21 am said:
      
      Ok, c’mon, can’t people have fun?
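
The distinction drawn above between "uniform on the range of the floats" and "uniform on valid 64 bit floating point numbers" can be checked with a rough simulation. This is a minimal sketch in Python rather than the Julia used in the thread, with a fixed seed and hypothetical helper names. It uses all 64 bits, which is an assumption since the comment speaks of 62. It reinterprets random bit patterns as IEEE doubles, drops NaN, infinities and zero, and compares the typical magnitude with that of draws spread evenly over [-floatmax, floatmax].

```python
import math
import random
import statistics
import struct
import sys

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def float_from_random_bits(rng):
    """Reinterpret 64 random bits as an IEEE 754 double."""
    bits = rng.getrandbits(64)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def uniform_on_range(rng):
    """Draw evenly from [-floatmax, floatmax], avoiding overflow in b - a."""
    return rng.choice((-1.0, 1.0)) * sys.float_info.max * rng.random()


def median_log10_magnitude(draw, n=10_000):
    """Median of log10|x| over n finite, nonzero draws."""
    logs = []
    while len(logs) < n:
        x = draw(rng)
        if math.isfinite(x) and x != 0.0:
            logs.append(math.log10(abs(x)))
    return statistics.median(logs)


median_log10_bits = median_log10_magnitude(float_from_random_bits)
median_log10_range = median_log10_magnitude(uniform_on_range)
print("random bit patterns: median log10|x| =", round(median_log10_bits, 1))
print("uniform on range:    median log10|x| =", round(median_log10_range, 1))
```

Because the exponent field of a random bit pattern is itself uniform, magnitudes are spread roughly evenly across orders of magnitude and the typical value sits near 1 on a log scale. A draw that is uniform on the range almost always has a magnitude within a few orders of magnitude of floatmax.
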

Statistical Modeling, Causal Inference, and Social Science


166 thoughts on “More on p-values etc etc etc”
