Forget about multiple testing corrections. Actually, forget about hypothesis testing entirely.

Tai Huang writes: I am reading this paper [Why we (usually) don’t have to worry about multiple comparisons, by Jennifer, Masanao, and myself]. I am searching how to do multiple comparisons correctly under Bayesian inference for A/B/C testing. For the traditional t-test approach, Bonferroni correction is needed to correct alpha value. I am confused with […]

“It’s not just that the emperor has no clothes, it’s more like the emperor has been standing in the public square for fifteen years screaming, I’m naked! I’m naked! Look at me! And the scientific establishment is like, Wow, what a beautiful outfit.”

I happened to come across this post while writing on another topic (the ever-popular “critical positivity ratio”) and I thought the title was so great I just had to post it again. Someone still needs to build that bot for me that reposts all the old posts from this blog, starting at the beginning in […]

Don’t talk about hypotheses as being “either confirmed, partially confirmed, or rejected”

Kevin Lewis points us to this article by Paige Shaffer et al., “Gambling Research and Funding Biases,” which reports, “Gambling industry funded studies were no more likely than studies not funded by the gambling industry to report either confirmed, partially confirmed, or rejected hypotheses.” The paradox is that this particular study was itself funded by […]

The latest Perry Preschool analysis: Noisy data + noisy methods + flexible summarizing = Big claims

Dean Eckles writes: Since I know you’re interested in Heckman’s continued analysis of early childhood interventions, I thought I’d send this along: The intervention is so early, it is in their parents’ childhoods. See the “Perry Preschool Project Outcomes in the Next Generation” press release and the associated working paper. The estimated effects are huge: […]

How to get out of the credulity rut (regression discontinuity edition): Getting beyond whack-a-mole

This one’s buggin me. We’re in a situation now with forking paths in applied-statistics-being-done-by-economists where we were, about ten years ago, in applied-statistics-being-done-by-psychologists. (I was going to use the terms “econometrics” and “psychometrics” here, but that’s not quite right, because I think these mistakes are mostly being made, by applied researchers in economics and psychology, […]

Progress in the past decade

It’s been a busy decade for our research. Before going on, I’d like to thank hundreds of collaborators, including students; funders from government, nonprofits, and private industry; blog commenters and people who have pointed us to inspiring research, outrages, beautiful and ugly graphs, cat pictures, and all the rest; all those of you who have […]

“Why we sleep” data manipulation: A smoking gun?

In his post, “Matthew Walker’s ‘Why We Sleep’ Is Riddled with Scientific and Factual Errors” (see our discussions here, here, and here), Alexey Guzey added the following stunner: We’ve left “super-important researcher too busy to respond to picky comments” territory and left “well-intentioned but sloppy researcher can’t keep track of citations” territory and entered “research […]

Horns! Have we reached a new era in skeptical science journalism? I hope so.

Pointing us to this news article from Aylin Woodward, “No, we’re probably not growing horns from our heads because of our cellphone use — here’s the real science,” Jordan Anaya writes: I haven’t looked into it, but seems like your basic terrible study with an attention grabbing headline. Pretty much just mention cell phone use […]

Response to criticisms of Bayesian statistics

I just happened to reread this article of mine from 2008, and I still like it! So I’m linking to it here. Enjoy.

Judith Rich Harris on the garden of forking paths

Ethan Ludwin-Peery writes: I finally got around to reading The Nurture Assumption and I was surprised to find Judith Rich Harris quite lucidly describing the garden of forking paths / p-hacking on pages 17 and 18 of the book. The edition I have is from 2009, so it predates most of the discussion of these […]

“Inferential statistics as descriptive statistics”

Valentin Amrhein, David Trafimow, and Sander Greenland write: Statistical inference often fails to replicate. One reason is that many results may be selected for drawing inference because some threshold of a statistic like the P-value was crossed, leading to biased reported effect sizes. Nonetheless, considerable non-replication is to be expected even without selective reporting, and […]

“Deep Origins” and spatial correlations

Morgan Kelly writes: Back in 2013 you had a column in Chance magazine on the Ashraf-Galor “Out of Africa” paper which claims that genetic diversity determines modern income. That paper is part of a much larger literature in economics on Persistence or “Deep Origins” that shows how medieval pogroms prefigure Nazi support, adoption of the […]

Unquestionable Research Practices

Hi! (This is Dan.) The glorious Josh Loftus from NYU just asked the following question. Obviously he’s not heard of preregistration. Seriously though, it’s always good to remember that a lot of ink being spilled over hypothesis testing and its statistical brethren doesn’t mean that if we fix that we’ll fix anything. It all comes to […]

What comes after Vixra?

OK, so Arxiv publishes anything. But some things are so cranky that Arxiv won’t publish them, so they go on Vixra. Here’s my question: where do the people publish, who can’t publish on Vixra? The cranks’ cranks, as it were? It’s a Cantor’s corner kinda thing.

In short, adding more animals to your experiment is fine. The problem is in using statistical significance to make decisions about what to conclude from your data.

Denis Jabaudon writes: I was thinking that perhaps you could help me with the following “paradox?” that I often find myself in when discussing with students (I am a basic neuroscientist and my unit of counting is usually cells or animals): When performing a “pilot” study on say 5 animals, and finding an “almost significant” […]

A reduction in error rate of 400-600%: Pretty good, huh?

In comments to the previous post, Alexey Guzey points to this bit from his post on sleep legend Matthew Walker: In The Lancet, Walker writes: pilot studies have shown that when you limit trainee doctors to no more than a 16 h shift, with at least an 8 h rest opportunity before the next shift, […]

“Whether something is statistically significant is itself a very random feature of data, so in this case you’re essentially outsourcing your modeling decision to a random number”

I happened to come across a post of mine that’s not scheduled until next April, and I noticed the above line, which I really liked, so I’m sharing it with you right now here. The comment relates to a common procedure in statistics, where researchers decide to exclude potentially important interactions from their models, just because […]

Break out the marshmallows, friends: Ego depletion is due to change sign!

In a paper amusingly titled, “Ego depletion may disappear by 2020,” Miguel Vadillo (link from Kevin Lewis) writes: Ego depletion has been successfully replicated in hundreds of studies. Yet the most recent large-scale Registered Replication Reports (RRR), comprising thousands of participants, have yielded disappointingly small effects, sometimes even failing to reach statistical significance. Although these […]

In research as in negotiation: Be willing to walk away, don’t paint yourself into a corner, leave no hostages to fortune

There’s a saying in negotiation that the most powerful asset is the ability to walk away from the deal. Similarly, in science (or engineering, business decision making, etc.), you have to be willing to give up your favorite ideas. When I look at various embarrassing examples in science during the past decade, a common thread […]

Why do a within-person rather than a between-person experiment?

Zach Horne writes: A student of mine was presenting at the annual meeting of the Law and Society Association. She sent me this note after she gave her talk: I presented some research at LSA which used a within subject design. I got attacked during the Q&A session for using a within subjects design and […]