“He had acquired his belief not by honestly earning it in patient investigation, but by stifling his doubts. And although in the end he may have felt so sure about it that he could not think otherwise, yet inasmuch as he had knowingly and willingly worked himself into that frame of mind, he must be held responsible for it.”

Posted on April 10, 2024 9:39 AM by Andrew

Ron Bloom points us to this wonderful article, “The Ethics of Belief,” by the mathematician William Clifford, also known for Clifford algebras. The article is related to some things I’ve written about evidence vs. truth (see here and here) but much more beautifully put. Here’s how it begins:

A shipowner was about to send to sea an emigrant-ship. He knew that she was old, and not overwell built at the first; that she had seen many seas and climes, and often had needed repairs. Doubts had been suggested to him that possibly she was not seaworthy. These doubts preyed upon his mind, and made him unhappy; he thought that perhaps he ought to have her thoroughly overhauled and refitted, even though this should put him to great expense. Before the ship sailed, however, he succeeded in overcoming these melancholy reflections. He said to himself that she had gone safely through so many voyages and weathered so many storms that it was idle to suppose she would not come safely home from this trip also. He would put his trust in Providence, which could hardly fail to protect all these unhappy families that were leaving their fatherland to seek for better times elsewhere. He would dismiss from his mind all ungenerous suspicions about the honesty of builders and contractors. In such ways he acquired a sincere and comfortable conviction that his vessel was thoroughly safe and seaworthy; he watched her departure with a light heart, and benevolent wishes for the success of the exiles in their strange new home that was to be; and he got his insurance-money when she went down in mid-ocean and told no tales.

What shall we say of him? Surely this, that he was verily guilty of the death of those men. It is admitted that he did sincerely believe in the soundness of his ship; but the sincerity of his conviction can in no wise help him, because he had no right to believe on such evidence as was before him. He had acquired his belief not by honestly earning it in patient investigation, but by stifling his doubts. And although in the end he may have felt so sure about it that he could not think otherwise, yet inasmuch as he had knowingly and willingly worked himself into that frame of mind, he must be held responsible for it.

Clifford’s article is from 1877!

Bloom writes:

One can go over this in two passes. One pass may be read as “moral philosophy.”

But the second pass helps one think a bit about how one ought to make precise the concept of ‘relevance’ in “relevant evidence.”

Specifically (this is remarkably deficient in the Bayesian corpus I find) I would argue that when we say “all probabilities are relative to evidence” and write the symbolic form straightaway P(A|E) we are cheating. We have not faced the fact — I think — that not every “E” has any bearing (“relevance”) one way or another on A and that it is *inadmissible* to combine the symbols because it is so easy to write ’em down. Perhaps one evades the problem by saying, well what do you *think* is the case. Perhaps you might say, “I think that E is irrelevant if P(A|E) = P(A|~E).” But that begs the question: it says in effect that *both* E and ~E can be regarded as “evidence” for A. I argue that easily leads to nonsense. To regard any utterance or claim as “evidence” for any other utterance or claim leads to absurdities. Here for instance:

A = “Water ice of sufficient quantity to maintain a lunar base will be found in the spectral analysis of the plume of the crashed lunar polar orbiter.”

E = “If there are Martians living on Europa, the moon of Jupiter, then they celebrate their Martian Christmas by eating Martian toast with Martian jam.”

Is E evidence for A? Is ~E evidence for A? Is any far-fetched hypothetical evidence for any other hypothetical whatsoever?

Just to provide some “evidence” that I am not being entirely facetious about the Lunar orbiter, I attach also a link to a now much superannuated item concerning that very intricate “experiment.” I believe in the end there was some spectral evidence turned up consistent with something like a teaspoon’s worth of water-ice per 25 square km.

P.S. Just to make the connection super-clear, I’d say that Clifford’s characterization, “He had acquired his belief not by honestly earning it in patient investigation, but by stifling his doubts. And although in the end he may have felt so sure about it that he could not think otherwise, yet inasmuch as he had knowingly and willingly worked himself into that frame of mind, he must be held responsible for it,” is an excellent description of those Harvard professors who notoriously endorsed the statement, “the replication rate in psychology is quite high—indeed, it is statistically indistinguishable from 100%.” Also a good match to those Columbia administrators who signed off on those U.S. News numbers. In neither case did a ship go down; it’s the same philosophical principle but lower stakes. Just millions of dollars involved, no lives lost.

As Isaac Asimov put it, “A robot may not injure a human being or, through inaction, allow a human being to come to harm.” Sometimes that inaction is pretty damn active, when a shipowner or a scientific researcher or a university administrator puts in some extra effort to avoid looking at some pretty clear criticisms.

20 thoughts on this post

Anonymous on April 10, 2024 at 10:41 am said:

Very interesting.

You state “Ron Bloom writes …” but I can’t find a link to the entire Bloom article/blog. Was this published, or a private communication to you?

- Andrew on April 10, 2024 at 10:48 am said:
  
  Anon:
  
  Yeah, he sent me an email.
  
- Anoneuoid on April 10, 2024 at 6:37 pm said:
  
  I think they’re mixing up the p(A|B) with p(H|E) examples.
  
  E.g., what’s the prob. that A) it’s raining, given B) there’s a pizza in the oven? There is some correlation there you can use to calculate a probability. But using the term evidence in this way is strange.
  
  - Anoneuoid on April 10, 2024 at 6:38 pm said:
    
    Was responding to Carlos.
    
Anoneuoid on April 10, 2024 at 1:05 pm said:

Is E evidence for A? is ~E evidence for A? Is any far-fetched hypothetical evidence for any other hypothetical whatsoever?

I guess A refers to some theory here. I think things will clear up if you realize p(A|E) is inversely related to p(A)*p(E|A) + p(B)*p(E|B) + … + p(Z)*p(E|Z)

I.e., p(A|E) is a relative measure, used to compare competing theories/explanations/whatever. You don’t look at A in isolation.
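
For reference, the expression above is the denominator of Bayes' theorem. A sketch of the full formula, under the assumption (not stated in the comment) that A, B, …, Z are mutually exclusive and together exhaustive:

```latex
% Bayes' theorem over competing hypotheses A, B, ..., Z
% (assumed mutually exclusive and exhaustive)
\[
  p(A \mid E) \;=\;
  \frac{p(A)\,p(E \mid A)}
       {p(A)\,p(E \mid A) + p(B)\,p(E \mid B) + \dots + p(Z)\,p(E \mid Z)}
\]
% If E is equally likely under every hypothesis, say p(E | H) = c for all H,
% the factor c cancels and the posterior equals the prior:
\[
  p(A \mid E) \;=\; \frac{c\,p(A)}{c\,\bigl(p(A) + p(B) + \dots + p(Z)\bigr)} \;=\; p(A)
\]
```

Read this way, an E that every competing hypothesis predicts equally well leaves the comparison between them unchanged.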

Daniel Lakeland on April 10, 2024 at 1:20 pm said:

Let’s consider some base of facts B, which is a set of symbolic expressions that we have some rule for “interpreting.” For example, the symbol “jam” refers to things in the world that are made of cooked fruit and sugar and form a thick paste.

Suppose we have a set of facts B for Base, and we have a probability assignment for a symbolic assertion A about the world.

p(A | B)

Now suppose we add some new fact F
p(A | B,F)

as well you could form a set S = B Union F and say p(A | S) so we should interpret the comma as a set union

now we could as well form the probability assignment p(A|B,not F)

Now if p(A|B) = p(A | B,F) = p(A | B,not F) we could say that “F is irrelevant for A” as a definition of “irrelevant”.
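
A minimal sketch of this definition on a toy model, in Python. Everything here is an illustrative assumption rather than part of the comment: the three propositions, the made-up prior weights, and the helper names `WORLDS`, `weight`, `prob`, and `is_irrelevant`. Conditioning on a set of facts means keeping only the worlds in which every fact in the set holds, so adding F to B is a union of fact sets.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy universe: each world fixes the truth of three propositions.
WORLDS = [dict(zip(("sink", "worn", "jam"), values))
          for values in product([True, False], repeat=3)]

def weight(w):
    """Made-up prior: sinking depends on hull wear and never on Martian jam."""
    p_worn = Fraction(3, 10) if w["worn"] else Fraction(7, 10)
    p_jam = Fraction(1, 2)  # same weight whether jam is True or False
    p_sink_true = Fraction(2, 5) if w["worn"] else Fraction(1, 20)
    p_sink = p_sink_true if w["sink"] else 1 - p_sink_true
    return p_worn * p_jam * p_sink

def prob(target, facts):
    """p(target | facts): facts is a set of predicates that must all hold."""
    allowed = [w for w in WORLDS if all(f(w) for f in facts)]
    total = sum(weight(w) for w in allowed)
    return sum(weight(w) for w in allowed if target(w)) / total

def sink(w): return w["sink"]
def worn(w): return w["worn"]
def not_worn(w): return not w["worn"]
def jam(w): return w["jam"]
def not_jam(w): return not w["jam"]

def is_irrelevant(target, base, fact, not_fact):
    """True when p(A|B) = p(A|B,F) = p(A|B,not F), per the definition above."""
    return (prob(target, base)
            == prob(target, base | {fact})
            == prob(target, base | {not_fact}))

base = {worn}                                  # B: what we already know
print(is_irrelevant(sink, base, jam, not_jam))       # jam has no bearing
print(is_irrelevant(sink, set(), worn, not_worn))    # hull wear does
```

Whether `jam` comes out irrelevant is decided entirely by how `weight` was written, not by the conditioning machinery; that choice is a modeling decision.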

In the ship example, for those who are atheists, we could say “God’s mercy” *should be irrelevant* to the assignment of probability as to whether the ship will sink.

In a less religious context we could say for example F = “the ship sailed safely 20 years ago” is also irrelevant (or, nearly irrelevant) to the question of whether it will sink today, because too many things could have happened to the ship over a 20-year period.

That there exist irrelevant facts does not detract from Bayes’ theorem; it only detracts from improper applications of the whole mechanism. If you incorrectly assign relevance to irrelevant facts, you commit a modeling error, not a logical error. That is, whether a thing is irrelevant or not is not something we can determine from pure logic, but rather requires a factual connection to the meaning in the world of the fact F.

The size of your big toe wart is irrelevant to the price of tea in China… unless toe warts are in a major pandemic outbreak and soaking in tea is a proven treatment widely reported in the news.

- Lukas Lohse on April 11, 2024 at 7:57 am said:
  
  >as well you could form a set S = B Union F and say p(A | S) so we should interpret the comma as a set union
  
  Surely you mean set intersection, i.e. the “worlds” where both B and F are true form the intersection of all the “worlds” with B and all the ones with F. The union corresponds to an or-connection.
  
  - Daniel Lakeland on April 12, 2024 at 1:27 am said:
    
    No I mean a union of the set of facts, that is we are assuming all the facts are true so they all go in the set of true facts, both the ones in B and the extra facts in F. The intersection would normally be the null set.
    
    Consider
    P(fish are wet |{Grass is green, sky is blue},{fish swim in water})
    
    The intersection of those two sets is {} the union is {grass is green, sky is blue, fish swim in water}
    
    - Lukas Lohse on April 12, 2024 at 5:44 am said:
      
      I guess that makes sense in isolation, but it doesn’t really gel with the definition of a probability measure as a function applied to subsets of an abstract set Omega. I don’t know your math background, so sorry if I’m over or under explaining.
      P({w in Omega, such that “fish are wet”} | {w in Omega, such that “Grass is green, sky is blue” and “fish swim in water”}) is a mathematically meaningful expression and uses and/intersect.
      Your approach might be better for people with little experience with the math sets, but I personally think it’s bad form to mix it up like that.
    - Daniel Lakeland on April 12, 2024 at 1:42 pm said:
      
      There are a number of confusions going on here I guess.
      
      First off, the foundation of Bayesian probability under Cox’s axioms *is not* measure theory. In this context probability generalizes Boolean logic on propositions (such as “fish are wet”). Probability is an assignment of a plausibility or credence that we give to the truth of the statement, conditional on knowing the truth of the conditioning statements. In that context, a union of conditioning statements is simply specifying all the things we know are true, hence a very natural thing. They are statements in a formal language, not sets of outcomes, though.
      
      Second off even within a measure theoretic context, the level at which I’m discussing is at the level of formal languages “above” the measure theory. That is, before we assign meaning. So “p(A | B,C) can be replaced with p(A | B U C)” is a transformation rule on sentences in the formal language. The “meaning” of p(A|B U C) is where the “intersection” takes place… that is
      
      “calculate p(A|B U C) by size(w in Omega such that A(w) holds and x(w) holds for all statements x in B U C) / size(w in Omega such that x(w) holds for all statements x in B U C)”
      
      In my opinion much of the problem in discussing probability comes from “jumping the gun” into measure theory and Kolmogorov axioms rather than discussing probability as a formal language for specifying logical connections between propositions.
    - Lukas Lohse on April 15, 2024 at 4:30 am said:
      
      > In my opinion much of the problem in discussing probability comes from “jumping the gun” into measure theory and Kolmogorov axioms rather than discussing probability as a formal language for specifying logical connections between propositions.
      
      Well, I certainly never even considered the possibility of other reasonable definitions of probability. I’m not going to pretend that I have anything useful to say about this, but I do want to say thank you for spelling it out for me.
- RM Bloom on April 19, 2024 at 4:21 pm said:
  
  Commas and brackets of different shapes and sizes will not wear the trousers.
  The issue is not notation; it is the completeness and/or consistency of some set of concepts within some ostensible deductive system.
  The ostensible deductive system allows for any statement about the world to be assigned a so-called “probability” and allows for any other statement about the world to be the “conditioning” event.
  That at any rate is the maximalist position.
  That all of our assertions, or rather, our beliefs — if they be related rationally, so to speak — are linked together by the web of that calculus.
  However we go about digging up support for those beliefs; they are ostensibly so linked.
  My point is that without constraint this system, seeming so powerful, is in fact over-powered.
  It is over-powered in the same way that unconstrained predicates bandied about in naive set theory lead to simple paradoxes like Russell’s.
  It is “too powerful” and needs to be cut down to size, so to speak.
  Just like Naive Set Theory needed to be cut down to size to weed out some of its more paradoxical predications.
  Not every “assertion” is relevant conditioning “evidence” for every other assertion.
  Some are not linked and enjoy no such P(A|B).
  At least not on the “available evidence” (I do see the humor in this story by the way).
  Very kind regards,
  RM Bloom
  
  - Daniel Lakeland on April 19, 2024 at 6:41 pm said:
    
    But probability has a way of saying that p(A|B,Stuff) = p(A | Stuff), which is to say that B is not informative for A. We both agree that “the shape of a certain asteroid in the Kuiper belt” is not informative for “the location of my red socks.” But that doesn’t mean you can’t say p(my red socks are in my drawer | the certain asteroid is thin like a pencil, other knowledge at my disposal); it’s just that we’d get the same result as p(my red socks are in my drawer | other knowledge at my disposal), since the asteroid shape is irrelevant.
    
    We don’t need to restrict what we can put on the right hand side, but we *do* need to restrict how we assign the values of the various probabilities.
    
Alex C. on April 10, 2024 at 2:30 pm said:

Something very similar happened with the Space Shuttle Challenger. Every time the shuttle survived a mission, the NASA workers became more (over)confident in the safety of the hardware.

Sean on April 10, 2024 at 2:37 pm said:

This is related to the question of what to do when someone gets drunk or high before committing a violent crime. If you let the altered state of consciousness be a mitigating factor, that encourages criminals to work themselves up to doing something awful with the help of a bottle, a pipe, or a needle.

Olaf Zimmermann on April 10, 2024 at 2:48 pm said:

@Andrew: I cannot thank you enough for this posting. (PS Psychology proper was a research programme begun in the 1860s and pretty much accomplished by 1980, since when we’ve had all manner of vocabulary shifts and worse …)

Carlos Ungil on April 10, 2024 at 5:49 pm said:

> Specifically (this is remarkably deficient in the Bayesian corpus I find) I would argue that when we say “all probabilities are relative to evidence” and write the symbolic form straightaway P(A|E) we are cheating.

That’s all very confusing. Probabilities (of uncertain things) are relative to whatever is known (the things assumed certain). E is not “evidence for A”; it’s just evidence. Of course not everything is “relevant,” and not everything has to be relevant for the symbols to make sense. The Bayesian corpus includes concepts like independence and mutual information, so absurdities are easily avoidable.
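
As a rough illustration of how mutual information quantifies relevance, here is a small Python sketch. The function name `mutual_information` and both joint distributions are made up for the example; the formula is the standard discrete definition, I(A;E) = sum of p(a,e) log2[p(a,e) / (p(a) p(e))].

```python
import math

def mutual_information(joint):
    """I(A;E) in bits for a joint distribution given as {(a, e): probability}."""
    p_a, p_e = {}, {}
    for (a, e), p in joint.items():
        # Accumulate the two marginal distributions.
        p_a[a] = p_a.get(a, 0.0) + p
        p_e[e] = p_e.get(e, 0.0) + p
    # Zero-probability cells contribute nothing to the sum.
    return sum(p * math.log2(p / (p_a[a] * p_e[e]))
               for (a, e), p in joint.items() if p > 0)

# Hypothetical independent pair: p(a, e) = p(a) * p(e) in every cell,
# with p(A) = 0.25 and p(E) = 0.5.
independent = {(True, True): 0.125, (True, False): 0.125,
               (False, True): 0.375, (False, False): 0.375}

# Hypothetical perfectly coupled pair: E is true exactly when A is.
coupled = {(True, True): 0.5, (False, False): 0.5}

print(mutual_information(independent))  # no information shared
print(mutual_information(coupled))      # one full bit shared
```

Zero mutual information is the same condition as p(A|E) = p(A) for every value of E, so an "irrelevant" E is still a perfectly legal thing to condition on.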

Gregory C. Mayer on April 11, 2024 at 6:51 am said:

Clifford’s argument is at least reasonably well-known in “critical thinking” and “skeptical” circles. Carl Sagan used the two paragraphs quoted by Bloom as the opening epigraph (pp. 207-208) to chapter 13, “Obsessed with Reality”, of his 1996 book, “The Demon-Haunted World” [ https://archive.org/download/DemonHauntedWorld_carlSagan/Sagan_-_The_Demon-Haunted_World___Science_as_a_candle_in_the_dark.pdf ], which is where I first learned it.

- Paul Hayes on April 12, 2024 at 9:33 pm said:
  
  And of course Clifford’s argument (and the ethical / moral aspects of inquiry and belief more generally) is also reasonably well-known in philosophy circles (e.g.). Unsurprisingly, given the content of many of the articles, it’s come up in comments here before too (I know I’ve posted about it and included that and/or the Wikipedia link a number of times in comments here). I try to bring it to people’s attention at every opportunity.
  
Roxana on May 29, 2024 at 12:18 pm said:

I’m reminded slightly of Curry’s paradox/Löb’s theorem: https://en.wikipedia.org/wiki/Curry%27s_paradox

which is slightly more tautological/self-referential than the example given here.


Statistical Modeling, Causal Inference, and Social Science
