Archive of posts filed under the Miscellaneous Statistics category.

How to think scientifically about scientists’ proposals for fixing science

I kinda like this little article which I wrote a couple years ago while on the train from the airport. It will appear in the journal Socius. Here’s how it begins: Science is in crisis. Any doubt about this status has surely been dispelled by the loud assurances to the contrary by various authority […]

What’s the p-value good for: I answer some questions.

Martin King writes: For a couple of decades (from about 1988 to 2006) I was employed as a support statistician, and became very interested in the p-value issue; hence my interest in your contribution to this debate. (I am not familiar with the p-value ‘reconciliation’ literature, as published after about 2005.) I would hugely appreciate […]

Glenn Shafer: “The Language of Betting as a Strategy for Statistical and Scientific Communication”

Glenn Shafer writes: I have joined the immense crowd writing about p-values. My proposal is to replace them with betting outcomes: the factor by which a bet against the hypothesis multiplies the money it risks. This addresses the desideratum you and Carlin identify: embrace all the uncertainty. No one will forget that the outcome of […]
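To make the idea of a betting outcome concrete, here is a minimal Python sketch. It assumes one simple kind of bet, a likelihood-ratio bet: the bettor names an alternative, and the payoff per unit risked is the probability of the observed data under the alternative divided by its probability under the hypothesis. The expected payoff of such a bet is 1 if the hypothesis is true, so a large payoff counts against the hypothesis. The coin-flip setup, the function names, and the numbers are hypothetical illustrations, not taken from Shafer's paper.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of k successes in n independent trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def betting_score(k: int, n: int, p_null: float, p_alt: float) -> float:
    """Factor by which a likelihood-ratio bet against p_null multiplies the stake.

    Hypothetical helper: the bettor backs p_alt and is paid the ratio of the
    two probabilities of the observed count k.
    """
    return binom_pmf(k, n, p_alt) / binom_pmf(k, n, p_null)


# Hypothetical data: 60 heads in 100 flips, betting against a fair coin
# by backing a coin with a 0.6 chance of heads.
score = betting_score(k=60, n=100, p_null=0.5, p_alt=0.6)
print(f"Each unit risked is multiplied by {score:.2f}")
```

A score above 1 means the bet against the hypothesis made money, and a score below 1 means it lost money; the size of the factor is the thing being reported.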

BizStat: Modeling performance indicators for deals

Ben Hanowell writes: I’ve worked for tech companies for four years now. Most have a key performance indicator that seeks to measure the rate at which an event occurs. In the simplest case, think of the event as a one-off deal, say an attempt by a buy-side real estate agent to close a deal on […]

Jeff Leek: “Data science education as an economic and public health intervention – how statisticians can lead change in the world”

Jeff Leek from Johns Hopkins University is speaking in our statistics department seminar next week:
Data science education as an economic and public health intervention – how statisticians can lead change in the world
Time: 4:10pm Monday, October 7
Location: 903 School of Social Work
Abstract: The data science revolution has led to massive new […]

Many perspectives on Deborah Mayo’s “Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars”

This is not new—these reviews appeared in slightly rawer form several months ago on the blog. After that, I reorganized the material slightly and sent to Harvard Data Science Review (motto: “A Microscopic, Telescopic, and Kaleidoscopic View of Data Science”) but unfortunately reached a reviewer who (a) didn’t like Mayo’s book, and (b) felt that […]

Controversies in the theory of measurement in mathematical psychology

We begin with this email from Guenter Trendler: On your blog you wrote: The replication crisis in social psychology (and science more generally) will not be solved by better statistics or by preregistered replications. It can only be solved by better measurement. Check this out: Measurement Theory, Psychology and the Revolution That Cannot Happen (pdf […]

Glenn Shafer tells us about the origins of “statistical significance”

Shafer writes: It turns out that Francis Edgeworth, who introduced “significant” in statistics, and Karl Pearson, who popularized it in statistics, used it differently than we do. For Edgeworth and Pearson, “being significant” meant “signifying”. An observed difference was significant if it signified a real difference, and you needed a very small p-value to be […]

Chow and Greenland: “Unconditional Interpretations of Statistics”

Zad Chow writes: I think your readers might find this paper [“To Aid Statistical Inference, Emphasize Unconditional Descriptions of Statistics,” by Greenland and Chow] interesting. It’s a relatively short paper that focuses on how conventional statistical modeling is based on assumptions that are often in the background and dubious, such as the presence of some […]

“Persistent metabolic youth in the aging female brain”??

A psychology researcher writes: I want to bring your attention to a new PNAS paper [Persistent metabolic youth in the aging female brain, by Manu Goyal, Tyler Blazey, Yi Su, Lars Couture, Tony Durbin, Randall Bateman, Tammie Benzinger, John Morris, Marcus Raichle, and Andrei Vlassenko] that’s all over the news. Can one do a regression […]

Dan’s Paper Corner: Can we model scientific discovery and what can we learn from the process?

Jesus taken serious by the many
Jesus taken joyous by a few
Jazz police are paid by J. Paul Getty
Jazzers paid by J. Paul Getty II
Leonard Cohen

So I’m trying a new thing because like no one is really desperate for another five thousand word essay about whatever happens to be on my […]

“Statistical Inference Enables Bad Science; Statistical Thinking Enables Good Science”

As promised, let’s continue yesterday’s discussion of Christopher Tong’s article, “Statistical Inference Enables Bad Science; Statistical Thinking Enables Good Science.” First, the title, which makes an excellent point. It can be valuable to think about measurement, comparison, and variation, even if commonly-used statistical methods can mislead. This reminds me of the idea in decision analysis […]

Harking, Sharking, Tharking

Bert Gunter writes: You may already have seen this [“Harking, Sharking, and Tharking: Making the Case for Post Hoc Analysis of Scientific Data,” John Hollenbeck, Patrick Wright]. It discusses many of the same themes that you and others have highlighted in the special American Statistician issue and elsewhere, but does so from a slightly different […]

My math is rusty

When I’m giving talks explaining how multilevel modeling can resolve some aspects of the replication crisis, I mention this well-known saying in mathematics: “When a problem is hard, solve it by embedding it in a harder problem.” As applied to statistics, the idea is that it could be hard to analyze a single small study, […]

My talk at the Metascience symposium Fri 6 Sep

The meeting is at Stanford, and here’s my talk:
Embracing Variation and Accepting Uncertainty: Implications for Science and Metascience
The world would be pretty horrible if your attitude on immigration could be affected by a subliminal smiley face, if elections were swung by shark attacks and college football games, if how you vote depended on […]

The methods playroom: Mondays 11-12:30

Each Monday 11-12:30 in the Lindsay Rogers room (707 International Affairs Bldg, Columbia University): The Methods Playroom is a place for us to work and discuss research problems in social science methods and statistics. Students and others can feel free to come to the playroom and work on their own projects, with the understanding that […]

What’s the origin of the term “chasing noise” as applying to overinterpreting noisy patterns in data?

Roy Mendelssohn writes: In an internal discussion at work I used the term “chasing noise”, which really grabbed a number of people involved in the discussion. Now my memory is I first saw the term (or something similar) in your blog. But it made me interested in who may have first used the term? Did […]

Beyond Power Calculations: Some questions, some answers

Brian Bucher (who describes himself as “just an engineer, not a statistician”) writes: I’ve read your paper with John Carlin, Beyond Power Calculations. Would you happen to know of instances in the published or unpublished literature that implement this type of design analysis, especially using your retrodesign() function [here’s an updated version from Andy Timm], […]
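For readers who have not seen a design analysis, here is a minimal Python sketch of the kind of calculation involved: given an assumed true effect size and a standard error, it reports the power, the probability that a statistically significant estimate has the wrong sign (Type S error), and the expected factor by which a significant estimate overstates the true effect (exaggeration ratio). This is not the retrodesign() function itself. It assumes a normal sampling distribution and a positive true effect, and the function name, arguments, and example numbers are hypothetical.

```python
import random
from statistics import NormalDist


def design_analysis(true_effect: float, se: float, alpha: float = 0.05,
                    n_sims: int = 100_000, seed: int = 0) -> dict:
    """Power, Type S error rate, and exaggeration ratio (normal approximation).

    Assumes true_effect > 0 and an estimate distributed Normal(true_effect, se).
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value

    # Probability of a significant estimate in each direction.
    p_hi = 1 - std_normal.cdf(z - true_effect / se)
    p_lo = std_normal.cdf(-z - true_effect / se)
    power = p_hi + p_lo
    type_s = p_lo / power  # share of significant results with the wrong sign

    # Simulate replications and keep the magnitudes of the significant ones.
    rng = random.Random(seed)
    significant = []
    for _ in range(n_sims):
        estimate = rng.gauss(true_effect, se)
        if abs(estimate) > z * se:
            significant.append(abs(estimate))
    exaggeration = (sum(significant) / len(significant)) / true_effect

    return {"power": power, "type_s": type_s, "exaggeration": exaggeration}


# Hypothetical inputs: a small assumed effect measured with a large standard error.
print(design_analysis(true_effect=0.5, se=1.0))
```

The exaggeration ratio is simulated rather than computed in closed form to keep the sketch short; the seed makes the result reproducible.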

More on the piranha problem, the butterfly effect, unintended consequences, and the push-a-button, take-a-pill model of science

The other day we had some interesting discussion that I’d like to share. I started by contrasting the butterfly effect—the idea that a small, seemingly trivial, intervention at place A can potentially have a large, unpredictable effect at place B—with the “PNAS” or “Psychological Science” view of the world, in which small, seemingly trivial, intervention […]

You should (usually) log transform your positive data

The reason for log transforming your data is not to deal with skewness or to get closer to a normal distribution; that’s rarely what we care about. Validity, additivity, and linearity are typically much more important. The reason for log transformation is that, in many settings, it should make additive and linear models make more sense. […]
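A small simulation can illustrate the point about additivity and linearity. The Python sketch below generates positive data from a multiplicative model, y = a * x^b * error, and fits a straight line to log(y) against log(x); on the log scale the model is additive and linear, so ordinary least squares recovers b as the slope and log(a) as the intercept. The data-generating model, the parameter values, and the function names are assumptions made up for this illustration.

```python
import math
import random


def least_squares(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return (intercept, slope) of the ordinary least squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate_and_fit(a: float = 2.0, b: float = 0.7, noise_sd: float = 0.3,
                     n: int = 500, seed: int = 1) -> tuple[float, float]:
    """Simulate y = a * x**b * exp(noise) and fit a line on the log-log scale."""
    rng = random.Random(seed)
    x = [rng.uniform(1, 100) for _ in range(n)]
    y = [a * xi**b * math.exp(rng.gauss(0, noise_sd)) for xi in x]
    return least_squares([math.log(xi) for xi in x], [math.log(yi) for yi in y])


intercept, slope = simulate_and_fit()
print(f"slope on log scale: {slope:.2f}, exp(intercept): {math.exp(intercept):.2f}")
```

On the log scale the slope has a direct reading: a 1% difference in x corresponds to roughly a b% difference in y.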