A longstanding principle in statistics

Posted on January 26, 2009 5:03 PM by Andrew

Hal Pashler writes:

I [Pashler] thought you guys would enjoy this charming little 1950 paper by Edward Cureton entitled “Validity, Reliability, and Baloney” (Dirk Vorberg, a German math psych guy, sent it). Long before machine learning, it seems that psychometrics people were confronting this issue, and the concrete form it took was “What should we make of validation measures computed with the same data that were used to select out particular items for inclusion in the test?” Just swap voxels for items, and it’s the same problem [as in the Vul, Harris, Winkielman, and Pashler paper on suspiciously high correlations in brain imaging studies].

This reminds me of a longstanding principle in statistics, which is that, whatever you do, somebody in psychometrics already did it long before. I’ve noticed this a few times. Once, about ten years ago, I was at a conference where computer scientists were talking about some pretty elaborate statistical models, and I realized these were the same as some things I’d seen Iven Van Mechelen and his colleagues working on in the Psychology Department at Leuven. Then, more recently, I wrote this article with David Park on splitting a predictor into three parts, and it turned out that similar work had been done in 1928! by psychometric researcher T. L. Kelley (and, oddly enough, E. Cureton in 1957).
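The problem Pashler describes above, validating a test on the same data used to pick its items, can be sketched in a small simulation. This is only an illustration under stated assumptions, not Cureton's procedure: every item and the criterion are pure noise, and the function names, sample sizes, and seed are made up for the example.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_subjects=100, n_items=200, n_keep=10, seed=0):
    """Hypothetical setup: items and criterion are independent noise."""
    rng = random.Random(seed)
    criterion = [rng.gauss(0, 1) for _ in range(n_subjects)]
    items = [[rng.gauss(0, 1) for _ in range(n_subjects)]
             for _ in range(n_items)]

    # Use only the first half of the subjects to choose items.
    half = n_subjects // 2
    ranked = sorted(
        range(n_items),
        key=lambda j: pearson(items[j][:half], criterion[:half]),
        reverse=True,
    )
    kept = ranked[:n_keep]

    # Test score = sum of the selected items for each subject.
    scores = [sum(items[j][i] for j in kept) for i in range(n_subjects)]

    # "Validity" on the selection data vs. on subjects not used for selection.
    same_data = pearson(scores[:half], criterion[:half])
    fresh_data = pearson(scores[half:], criterion[half:])
    return same_data, fresh_data


same_data_r, fresh_data_r = simulate()
print(f"validity on the selection data: {same_data_r:.2f}")
print(f"validity on held-out subjects:  {fresh_data_r:.2f}")
```

Since nothing here is truly related to the criterion, any sizable correlation on the selection data comes from the selection step itself. The held-out correlation is the honest estimate, and it should hover around zero.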

3 thoughts on “A longstanding principle in statistics”

Frank on January 26, 2009 at 5:42 pm said:

Unfortunately, social scientists are better at reinventing the wheel than at accumulating knowledge.

I wonder if we need to demand more extensive literature reviews, on the one hand, and faster conversion of vintage articles into machine-readable format, on the other.

The way we go about learning seems quite inefficient.

EmilyKennedy on January 27, 2009 at 5:58 am said:

Wouldn't it be great if we had Anathem-style Lorites in our academic communities?

http://anathem.wikia.com/wiki/Lorite

Laurens on January 29, 2009 at 10:23 pm said:

I've recently noticed this too. To my surprise, I found out that a multidimensional scaling variant called Stochastic Neighbor Embedding (Hinton & Roweis, 2002) is identical to a psychometrics model called the similarity choice model (Shepard, 1957; Luce, 1963).

It makes you wonder which pearls are still hidden in old volumes of Psychometrika!
