To follow up on a recent post, I thought it would be amusing to consider the most important unrecognized statisticians. That is, those statisticians of the past who made important contributions which have been largely forgotten.

Any suggestions? Dead people only, please.

I think Wassily Hoeffding is not recognized as much as his work deserves. Beyond his well-known inequality and the so-called Fréchet-Hoeffding bounds, most of his work was rescued posthumously in The Collected Works of Wassily Hoeffding (1994), N. I. Fisher and P. K. Sen, eds., Springer-Verlag, New York. For example, he envisioned the possibility of standardizing any joint distribution function, which brought him pretty close to the now very popular concept of copula functions derived from Sklar’s theorem.

Arturo:

This is potentially an excellent suggestion, given that I’d never heard of the guy myself!

Jerzy Neyman, inventor of confidence intervals and stratified sampling.

Desch:

Neyman is great but he’s hardly unrecognized!

Can I upvote a couple of philosophers who weren’t inclined to do calculus? Both were circling around causality and counterfactuals.

1. John Stuart Mill, for his conception of probability modeling as a kind of counterfactual. That is, you imagine that an outcome you observed could have turned out differently. This is the key move in philosophical modeling, causal or not. Mill also moved from a Bayesian perspective to a more frequentist perspective. There’s a fascinating discussion of Mill’s changing points of view in M.G. Bulmer’s great little intro, Principles of Statistics, on pages 5 and 6.

2. Gottlob Frege, for formalizing the notion of “possible world”. It’s the first formalization I know of the idea that things could’ve been different. As such, it’s related to Kolmogorov’s modern definition of probability. Although Frege tried to formalize his system, Russell shot it down with his famous paradox; the notions weren’t properly formalized in logic until Saul Kripke’s work on modal logic in the 1950s.

Bob:

Of course, I am going to suggest CS Peirce, but not sure if he is still unrecognised.

But Frege apparently was a bit ahead of him (from Wiki):

“Frege’s work on the logic of quantifiers had little influence on his contemporaries, although it was published four years before the work of Peirce” … “Peirce apparently was ignorant of Frege’s work”.

Can’t go wrong with Peirce! Or Dewey for that matter. I love the pragmatists.

I’ve spent a long time thinking about quantifiers—I used to be a Montague-style formal semanticist. One of my crowning achievements as a linguist was working out the semantics of “pied-piping” with quantifiers in categorial grammar.

But I had no idea Peirce wrote on quantifiers.

Bob: Not too sure about this (do you know their work?) but there are some possible leads here http://www.jfsowa.com/pubs/csp21st.pdf

(I did like this distinction “classified philosophers by the terms nothing else and something more”, Peirce definitely was in the “something more” category but did substantive work in the “nothing else” category.)

John Craig. The case is in Chapter 13 of Stigler’s Statistics on the Table. Possibly (if you believe Stigler) the inventor of log-likelihood. http://en.wikipedia.org/wiki/John_Craig_(mathematician)

David Sprott: The study of the profile and marginal likelihood functions for conducting interval inference (p. 106, “Statistical Inference in Science”, Sprott, D.).

FJ:

I enjoyed my interactions with Sprott but the profile and marginal likelihood functions for conducting interval inference never really delivered a fully satisfactory approach.

I agree that profile and marginal likelihood functions are not a panacea, but that is the case with any approach. It is a bit hard to find a “fully satisfactory approach”.

Several good answers (I hadn’t heard of John Craig or David Sprott, and John Stuart Mill and Frege are well known, of course, but I wasn’t aware of their contribution to statistical thought).

I would like to nominate Arthur Bowley. There is a recent biography of him (with the subtitle A Pioneer in Modern Statistics and Economics) by Andrew I. Dale and Samuel Kotz, two authors whose statistical books I’ve read and enjoyed, and yet I’d never heard of Bowley.

Oscar Kempthorne and David Freedman.

+1 for David Freedman, especially his emphasis on “shoeleather”, and writing in places where users of statistics would be likely to read.

Martha:

In my experience Freedman was either a liar or a fool or a completely lazy person (although if you tell untruths via the expedient of carefully not checking the truth-value of your statements, this is a sort of lying, as far as I’m concerned), or perhaps all three. But I suppose we all have our good and bad aspects.

Well, you may know something I don’t. Can you give examples of where he appeared to be a liar or a fool or completely lazy?

I don’t have the exact quote here, but he wrote in 1995 that all my work used linear models. In writing this report, he had seen a paper of mine (later published in the Journal of the American Statistical Association and later to receive an award from the association) that used nonlinear differential equation models. So that was just false.

But, hey, nonlinear models are hard to find, and he was tired, maybe he didn’t feel like reading so carefully? Fine, but then he shouldn’t have made the false statement. He could, for example, have written something like: “I read papers X, Y, and Z, but I didn’t read the paper on toxicokinetics. Papers X, Y, and Z used linear models.” Such a statement wouldn’t’ve sounded so impressive but it would’ve been true.

Freedman valued rhetorical impressiveness over truth, and in many ways it served him well, as lots of people like his impressive rhetoric. True, he lost the opportunity to learn some interesting things about nonlinear Bayesian models, but that’s the tradeoff. Had he learned more, he wouldn’t’ve been able in good conscience to make some of his stirring but false statements.

Sounds like a genuine error to me, nothing malicious.

Rahul:

I have no idea what is malicious. I just said he was either lying or foolish or lazy. It was the sort of genuine error that someone makes when he is lying, or when he is too clueless to know when a model is nonlinear, or when he is too lazy to look.

I don’t know but my guess is that all three of these were going on here. I’m guessing he was lying, or at least comfortable with untruth, because when I corrected the statement he never apologized for it. If I’d slammed someone’s work based on a false statement of mine, and it turned out I’d made a mistake, I’d feel bad about it. I’m guessing he was clueless and lazy because I think that after years of not doing technical work, Freedman may well have been scared off by the notation in our paper—it was easier for him to offer a blanket criticism than to read the article.

But I have no idea if there was any malicious intent. My guess, based on what I heard from others, is that (a) Freedman thought of this sort of thing as an anything-goes game, where the goal was to win, not to assess the truth, and (b) he believed that he was on the side of some larger truth. So maybe he didn’t think the details mattered.

i was considering mentioning kempthorne. but was not sure if he is unrecognized.

fun question. it can be hard to tell if people have been lost to history. (and it depends on who one is talking to.) also, since i am pretty clueless about stuff, i often cannot tell of its importance.

unless one can get one’s name attached to some commonly used statistical method or have written a famous book which everyone uses, it is easy to be forgotten, especially for the layperson. and the way things are taught, students are often not exposed to the history. just having a method named after some person though, their contributions can still be underestimated. eg, imagine if fisher were only known for andrew’s favorite test, fisher’s exact test. (i kid.)

Jimmy:

Kempthorne, really? He wrote that book on design and analysis of experiments, but is there really anything special there that’s not in other books? I associate Kempthorne with various clueless anti-Bayesian screeds in old-time JRSS-B discussions. (He had a memorable rant against the Lindley and Smith 1972 paper on hierarchical models, where Lindley and Smith just destroyed him in the rejoinder in one sentence; see pages 13-14 here.)

hah! that is an example of what i meant by me being clueless. i did not realize kempthorne had written a book. (i guess a wikipedia search would have informed me there. but the design books that i heard of are the box hunter hunter one, the cochran and cox one, and the one by fleiss.)

as alluded, i was trying to think of statisticians whose names i had heard of, but did not have a test named after them nor wrote a famous book.

as for the assessment of their work, i really do not know how to judge. so it is good to hear people’s opinions. will take a look at the paper.

and to possibly show my cluelessness again, the other name that i thought of was herb robbins. but he has an algorithm named after him. and that just seemed weird. because walking up to the 10th floor, everybody knows his name. i cannot tell if that is geographical bias though.

The Rosenbluths, for MCMC?

My vote is for Jerome Cornfield, at least if he is forgotten enough.

Cornfield and Tukey sorted out mixed effects beautifully.

Cornfield had a nice argument for Fisher who was saying that some unknown factor might cause both smoking and lung cancer. The missing factor had to be more strongly related to each of those than they are to each other. Hard to fathom. [Maybe somebody asked Fisher why non-smokers started getting cancer in the 20th century but not so much before that?]

Somehow I thought there was a third thing, but I can’t put my finger on it right now. However his “Statistician’s Apology” article is quite sensible and still resonates now if you’re of an applied bent.
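
Cornfield’s point about the hidden factor can be checked numerically. The following is only a minimal sketch under simplifying assumptions that are not stated in the comment above: a single binary hidden factor, no effect of smoking itself, and disease risk that depends only on the factor. The function names and the grid of values are hypothetical, chosen purely for illustration. Under those assumptions, the risk ratio one would observe between smokers and non-smokers can never exceed either the factor’s prevalence ratio between the two groups or the factor’s own risk ratio for disease.

```python
def observed_risk_ratio(p_exposed, p_unexposed, factor_rr):
    """Risk ratio between exposed and unexposed groups that is induced
    purely by a hidden binary factor (assumption: exposure has no effect).

    p_exposed   -- prevalence of the hidden factor among the exposed
    p_unexposed -- prevalence of the hidden factor among the unexposed
    factor_rr   -- risk ratio of disease for factor present vs. absent
    """
    # Baseline risk cancels out, so risks are expressed relative to it.
    risk_exposed = p_exposed * factor_rr + (1.0 - p_exposed)
    risk_unexposed = p_unexposed * factor_rr + (1.0 - p_unexposed)
    return risk_exposed / risk_unexposed


def check_cornfield(steps=20, factor_rrs=(1.5, 2.0, 5.0, 10.0, 50.0)):
    """Scan a grid (hypothetical values) and confirm that the observed
    risk ratio never exceeds the prevalence ratio or the factor's own
    risk ratio."""
    tol = 1e-9  # guard against floating-point rounding
    for factor_rr in factor_rrs:
        for i in range(1, steps):
            for j in range(1, i):  # factor is more common among the exposed
                p_exposed = i / steps
                p_unexposed = j / steps
                rr = observed_risk_ratio(p_exposed, p_unexposed, factor_rr)
                if rr > p_exposed / p_unexposed + tol:
                    return False
                if rr > factor_rr + tol:
                    return False
    return True


if __name__ == "__main__":
    print(check_cornfield())
    print(observed_risk_ratio(0.9, 0.1, 10.0))
```

For instance, in this toy setup a factor present in 90% of smokers and 10% of non-smokers that multiplies disease risk tenfold would still produce an observed risk ratio below five.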

Here’s a nomination that ought to go over like a lead zeppelin: Charles Geyer. Actually, this comment is just an excuse to mention what I feel is one of the most under-appreciated papers of all time: No n asymptotics. Bayesians can appreciate it for its “n is always equal to one” orientation and for how it (implicitly) identifies how much data must accumulate to wash the prior out.

Corey:

Charlie’s great. I think he’s still alive, so he doesn’t really work for the present discussion (see the last sentence of the above post), but he came up on this blog several years ago, as the author of one of the greatest works of statistics never published. It’s certainly reasonable to think that the author of a great unpublished paper could himself be under-recognized.

Oops!

Darn, damn rules about being dead.

Charles is definitely still alive, he was posting on Reddit an hour ago.

http://www.reddit.com/user/berf

Glad to hear it.

T.N. Thiele. There is an excellent book all about his excellent work, but it doesn’t seem to have been much noticed.

http://ukcatalogue.oup.com/product/9780198509721.do

Among other details, Karl Pearson was less than noble when he discovered Thiele’s prior work on moments. He cited Thiele disparagingly in such a way that you got the impression that it was not worth following up the reference.