## Lorraine Daston (1994): “How Probabilities Came to Be Objective and Subjective”

Sander Greenland points us to a 1994 paper by Lorraine Daston, “How Probabilities Came to Be Objective and Subjective.”

Also relevant are the papers by Glenn Shafer and Michael Cowles and Caroline Davis that we linked to a few months ago.

1. Aha, I’m rereading Hacking’s Emergency of Probability by happenstance. So great follow on.

• The emergence of Probability. Hmmmm. What a faux pas.

Sameera, please explain, I am curious……

• Joe,

I misspelled ‘Emergence’ & added ‘happenstance’. So it was a little embarrassing to correct it. But it was an inadvertent or maybe an unconscious pun about the crisis within statistics also; in that assignment of probabilities is confusing for most of us. In the political science field, some of the probability assignments seemed lampoonable.

• John Richters says:

Great recommendation. I’ll see your “How Probabilities …” recommendation and raise you Daston’s brilliant “The Moral Economy of Science”: https://pure.mpg.de/rest/items/item_2276978/component/file_3053949/content

• Justin says:

I like demos like coin flip and convergence of relative frequency of heads, quincunx shape and counts, dice and card game outcomes, balls from urns, etc., that work no matter your beliefs.

Justin
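As a rough illustration of the first demo mentioned above, here is a minimal Python sketch that simulates coin flips and tracks the running proportion of heads. The function name `running_heads_frequency`, the fixed seed, and the flip counts are all assumptions made for this example; nothing here comes from the original comment.

```python
import random


def running_heads_frequency(n_flips, p_heads=0.5, seed=0):
    """Return the running proportion of heads after each of n_flips flips.

    Hypothetical helper for illustration; the seed only makes runs repeatable.
    """
    rng = random.Random(seed)
    heads = 0
    freqs = []
    for i in range(1, n_flips + 1):
        # rng.random() < p_heads is True (counted as 1) with probability p_heads
        heads += rng.random() < p_heads
        freqs.append(heads / i)
    return freqs


freqs = running_heads_frequency(10_000)
for n in (10, 100, 1_000, 10_000):
    # The running proportion typically settles near 0.5 as n grows.
    print(n, round(freqs[n - 1], 4))
```

Note that the "coin" here is a deterministic pseudo-random generator, so the sketch shows only that the relative frequency settles down, not why it does.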

• Anonymous says:

Not just beliefs. Since almost all n=100 binary sequences have the property that the proportion of outcomes ~1/2, then they work almost no matter what the underlying mechanism generating the sequence is.

The mechanism could be the physics of coin flips (Euler’s equations of rigid body motion), or it could be an erratic, but deterministic, sequence of numbers (i.e. a “random” number generator). Almost any mechanism will give you that ~1/2 outcome. To get a consistent failure, you pretty well need a mechanism that specifically targets one of those extremely rare sequences with proportions far from 1/2, such as a coin with identical sides.

Hmmm … I wonder if the non-frequency interpretation of probabilities and Bayesian statistics is really just a tool for detecting when this happens, and the assumption that each sequence occurs with frequency 1/2^100 is just a gratuitous and un-testable (you can’t generate 2^100 sequences!) irrelevancy?
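The counting claim above (that almost all n=100 binary sequences have a proportion of 1s near 1/2) can be checked exactly, without any simulation, by summing binomial coefficients. The sketch below is one way to do that; the function name `fraction_near_half` and the particular tolerances are assumptions chosen for illustration, and "near" is taken to mean within `delta` of 1/2.

```python
from fractions import Fraction
from math import comb


def fraction_near_half(n, delta):
    """Exact fraction of length-n binary sequences whose proportion of 1s
    lies within delta of 1/2. Pure counting: no probability model is assumed.
    """
    half = Fraction(1, 2)
    # comb(n, k) is the number of sequences with exactly k ones
    count = sum(
        comb(n, k)
        for k in range(n + 1)
        if abs(Fraction(k, n) - half) <= delta
    )
    return Fraction(count, 2 ** n)


for delta in (Fraction(1, 20), Fraction(1, 10), Fraction(1, 5)):
    share = fraction_near_half(100, delta)
    print(f"within {float(delta):.2f} of 1/2: {float(share):.6f}")
```

The result is a statement about the set of all sequences, not about any mechanism that produces them, which is the distinction the comment is drawing.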

• Curious says:

If I code the 5 of Clubs as a 1 and the other 51 cards as 0 — what is the probability of selecting a 5 of Clubs with a single draw of a card?

• Anonymous says:

I think your point is that 1 and 0 won’t appear ~1/2 in this case. My response is that the following two statements aren’t even close to being contradictory:

“Sometimes what you don’t know doesn’t make much difference because the outcomes are similar either way.”

and,

“Sometimes to get a good prediction, you need to know more about how the outcome was generated.”

• Curious says:

Agreed. If I am choosing among a group of brilliant applicants and brilliance translates into increased success of the organization — then my measurement methods and ability to differentiate among this group are irrelevant to the success of the company though perhaps not to the applicants or to society more generally. The same would be true if it were not possible to affect the success of the organization no matter the level of brilliance.

However, if I do care whether my models systematically discriminate against historically oppressed groups of people, and applicants do vary along constructs that are in fact related to the future success of the organization, then perhaps I might want to understand the causal processes predictive of such. Because if members of a particular group perform worse at my company because they are systematically discriminated against in performance reviews and are systematically prevented from succeeding, that is a problem that cannot be solved simply by improving measurement and predictive models.

• Curious says:

Anonymous:

After thinking about this a bit, I will say that in my experience working on prediction models of human behavioral events that are coded as binary (1 = occurred, 0 = did not occur), identifying measures with a positive hit rate of 0.5 was quite common, but increasing that substantially into the 0.7’s to the 0.9’s was far less common and required more knowledge and better measures.

Thus, a person with the highest score based on calculating the individual data by model parameters had about a 50% chance of having a 1 for the outcome.

• Anonymous says:

When one says, “the vast majority of cases will be near .5” there’s a bit of ambiguity in the language. This could mean a relative statement like:

“99.99% of the possible sequences will be within delta of .5”

If we’re thinking in absolute terms though, then for a 100 flip sequence, that .01% represents (2^100)/10000 possibilities, which is an insanely large number.

Those sequences are “rare” in relative terms, but “abundant” in absolute terms.

So if we know nothing about which of those 2^100 sequences we’ll see, then we’re justified in guessing “something with a ratio near .5”, but in no sense is that “best guess” a guarantee of anything.
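To put a number on "rare in relative terms, abundant in absolute terms", the arithmetic above can be done exactly with Python integers. This is only a sketch of that one calculation; the variable names are made up for the example, and the .01% figure is taken directly from the comment rather than computed from any particular delta.

```python
# Total number of distinct 100-flip binary sequences
total_sequences = 2 ** 100

# .01% of them, i.e. one in 10,000 (integer division, so rounded down)
rare_sequences = total_sequences // 10_000

print(f"all sequences:  {total_sequences:.3e}")
print(f".01% of them:   {rare_sequences:.3e}")
print("digits in the rare count:", len(str(rare_sequences)))
```

Even the "rare" slice is a number with more than two dozen digits, far beyond anything that could be enumerated or sampled exhaustively.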

Reading these historical papers, and others, raises a question that has intrigued me for a long time. These texts deal with probabilities such as coin tosses, the fit between expected and measured frequencies in distributions, and so on, giving credit to Pearson and Galton for formalizing these questions about fitting distributions, and to De Moivre, Gauss and Pearson for concepts of error and standard deviation.

So my question is about Mendel and his famous breeding experiments with inheritance of traits in peas and other plants. From empirical results for many crosses, with varying numbers in each case, he inferred his Laws of Inheritance, i.e. 1:1 or 1:2:1 ratios depending on the cross. He never found these precise ratios, but instead understood that each cross was a sample of a population. And from his samples he inferred the general rules.

All of this implies that he understood implicitly the principles of sampling, and therefore the issue of fitting empirical numbers to expected results from true ratios. None of this was expressed in terms of formal probabilities or fit between observed and expected distributions. He must have understood these questions to do what he did, yet his papers barely discuss these issues, and indeed seem to take the methods for granted. And his work preceded Galton and Pearson, so he could not have been privy to their insights.

Was his thinking common at the time? Were his methods common practice? To what extent was he a pioneer in empirical aspects of probability and statistics as well as in genetics and the laws of inheritance?

(I know some have questioned his methods because his results fit expectations a bit too closely for comfort. But at the time the methods for data processing were far more rudimentary than they are today, and I am reluctant to hold him to our standards. Regardless, his insights were deep, largely correct (there are exceptions), and prescient (I doubt he realized how general his Laws would be found to be).)

3. Matt Skaggs says:

Ouch. Just kept peeling that onion for objective and subjective probability, but the paper is about semantics. Which I kind of figured it would be, because I have long suspected that the concept of “probability” is so nebulous that it is essentially useless as a word that communicates an idea. To my prior (lack of) understanding I have now added the wisdom that objective probability is a shit sandwich, and subjective probability is a shit casserole, and they are different, dammit!

4. Megan Higgs says:

On the topic of probability, its history and foundations, related philosophy, and connections to inference in practice — I suggest the very accessible and interesting Willful Ignorance: The Mismeasure Of Uncertainty by Herbert Weisberg (2014).

https://www.wiley.com/en-us/Willful+Ignorance%3A+The+Mismeasure+of+Uncertainty-p-9780470890448