Archive of posts filed under the Miscellaneous Statistics category.

A debate about effect-size variation in psychology: Simmons and Simonsohn; McShane, Böckenholt, and Hansen; Judd and Kenny; and Stanley and Doucouliagos

A couple weeks ago, Uri Simonsohn and Joe Simmons sent me and others a note that they were writing a blog post citing some of our work and asking for us to point out anything that we find “inaccurate, unfair, snarky, misleading, or in want of a change for any reason.” I took a quick […]

Ballot order update

Darren Grant writes: Thanks for bringing my work on ballot order effects to the attention of a wider audience via your recent blog post. The final paper, slightly modified from the version you posted, was published last year in Public Choice. Like you, I am not wedded to traditional hypothesis testing, but think it is […]

Wanted: Statistical success stories

Bill Harris writes: Sometime when you get a free moment, it might be great to publish a post that links to good, current exemplars of analyses. There’s a current discussion about RCTs on a program evaluation mailing list I monitor. I posted links to your power=0.06 post and your Type S and Type M post, […]

No, it’s not correct to say that you can be 95% sure that the true value will be in the confidence interval

Hans van Maanen writes: May I put another statistical question to you? If I ask my frequentist statistician for a 95%-confidence interval, I can be 95% sure that the true value will be in the interval she just gave me. My visualisation is that she filled a bowl with 100 intervals, 95 of which do […]
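The coverage property at issue in this question concerns the interval-generating procedure over repeated samples, not any single computed interval. Here is a minimal simulation sketch of that long-run property, assuming normally distributed data with a known standard deviation; the function name `coverage_rate` and all parameter values are hypothetical, chosen only for illustration.

```python
import random
from statistics import NormalDist, mean


def coverage_rate(true_mean=0.0, sigma=1.0, n=30, n_sims=2000,
                  level=0.95, seed=1):
    """Fraction of simulated z-intervals that contain the true mean.

    Assumes normal data with known sigma (an illustrative simplification).
    """
    rng = random.Random(seed)
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(n_sims):
        # Draw a fresh sample and compute its mean.
        sample_mean = mean(rng.gauss(true_mean, sigma) for _ in range(n))
        # The interval sample_mean +/- half_width covers the true mean
        # exactly when the sample mean lies within half_width of it.
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print(coverage_rate())
```

Across many repetitions the returned fraction should be close to the nominal level. Each individual interval, once computed, either contains the true value or does not.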

Claims about excess road deaths on “4/20” don’t add up

Sam Harper writes: Since you’ve written about similar papers (that recent NRA study in NEJM, the birthday analysis) before and we linked to a few of your posts, I thought you might be interested in this recent blog post we wrote about a similar kind of study claiming that fatal motor vehicle crashes increase by 12% after 4:20pm […]

Lessons about statistics and research methods from that racial attitudes example

Yesterday we shared some discussions of recent survey results on racial attitudes. For students and teachers of statistics or research methods, I think the key takeaway should be that you don’t want to pull out just one number from a survey; you want to get the big picture by looking at multiple questions, multiple years, […]

“Sometimes all we have left are pictures and fear”: Dan Simpson talk in Columbia stat dept, 4pm Monday

4:10pm Monday, April 22 in Social Work Bldg room 903: Data is getting weirder. Statistical models and techniques are more complex than they have ever been. No one understands what code does. But at the same time, statistical tools are being used by a wider range of people than at any time in the past. […]

Abandoning statistical significance is both sensible and practical

Valentin Amrhein, Sander Greenland, Blakeley McShane, and I write: Dr Ioannidis writes against our proposals [here and here] to abandon statistical significance in scientific reasoning and publication, as endorsed in the editorial of a recent special issue of an American Statistical Association journal devoted to moving to a “post p […]

Several reviews of Deborah Mayo’s new book, Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars

A few months ago I sent the following message to some people: Dear philosophically-inclined colleagues: I’d like to organize an online discussion of Deborah Mayo’s new book. The table of contents and some of the book are here at Google books, also in the attached pdf and in this post by Mayo. I think that […]

“The Long-Run Effects of America’s First Paid Maternity Leave Policy”: I need that trail of breadcrumbs.

Tyler Cowen links to a research article by Brenden Timpe, “The Long-Run Effects of America’s First Paid Maternity Leave Policy,” that begins as follows: This paper provides the first evidence of the effect of a U.S. paid maternity leave policy on the long-run outcomes of children. I exploit variation in access to paid leave that […]

Thinking about “Abandon statistical significance,” p-values, etc.

We had some good discussion the other day following up on the article, “Retire Statistical Significance,” by Valentin Amrhein, Sander Greenland, and Blake McShane. I have a lot to say, and it’s hard to put it all together, in part because my collaborators and I have said much of it already, in various forms. For […]

An R package for multiverse analysis and counting researcher degrees of freedom

Joachim Gassen writes: In a recent blog post I introduce an in-development R package that helps researchers to identify, document and exhaust inherent research design choices in work based on observational data. As the analysis that I propose is similar in notion to a multiverse analysis that you suggested, I thought that maybe the package […]

A comment about p-values from Art Owen, upon reading Deborah Mayo’s new book

The Stanford statistician writes: One of the fun parts of this was reading some of what Meehl wrote. I’d seen him quoted but had not read him before. What he says reminds me a lot of how p values were presented when I was an undergraduate at Waterloo. They emphasized large p values as a […]

How to approach a social science research problem when you have data and a couple different ways you could proceed?

tl;dr: Someone asks me a question, I can’t really tell what he’s talking about, so I offer some generic advice. Joe Hoover writes: An issue has come up in my subsequent analyses, which use my MrsP estimates to explore the relationship between county-level moral values and the county-level distribution of hate groups, as defined by […]

Understanding how Anova relates to regression

Analysis of variance (Anova) models are a special case of multilevel regression models, but Anova, the procedure, has something extra: structure on the regression coefficients. As I put it in the rejoinder for my 2005 discussion paper: ANOVA is more important than ever because we are fitting models with many parameters, and these parameters can […]

Yes, I really really really like fake-data simulation, and I can’t stop talking about it.

Rajesh Venkatachalapathy writes: Recently, I had a conversation with a colleague of mine about the virtues of synthetic data and their role in data analysis. I think I’ve heard a sermon/talk or two where you mention this and also in your blog entries. But having convinced my colleague of this point, I am struggling to […]

“Retire Statistical Significance”: The discussion.

So, the paper by Valentin Amrhein, Sander Greenland, and Blake McShane that we discussed a few weeks ago has just appeared online as a comment piece in Nature, along with a letter with hundreds (or is it thousands?) of supporting signatures. Following the first circulation of that article, the authors of that article and some […]

My two talks in Montreal this Friday, 22 Mar

McGill University Biostatistics seminar, Purvis Hall, 102 Pine Ave. West, Room 25; Education Building, 3700 McTavish Street, Room 129 [note new location], 1-2pm Fri 22 Mar: Resolving the Replication Crisis Using Multilevel Modeling. In recent years we have come to learn that many prominent studies in social science and medicine, conducted at leading research institutions, […]

Statistical-significance thinking is not just a bad way to publish, it’s also a bad way to think

Eric Loken writes: The table below was on your blog a few days ago, with the clear point about p-values (and even worse the significance versus non-significance) being a poor summary of data. The thought I’ve had lately, working with various groups of really smart and thoughtful researchers, is that Table 4 is also a […]

R package for Type M and Type S errors

Andy Garland Timm writes: My package for working with Type S/M errors in hypothesis testing, ‘retrodesign’, is now up on CRAN. It builds on the code provided by Gelman and Carlin (2014) with functions for calculating type S/M errors across a variety of effect sizes as suggested for design analysis in the paper, a function […]
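For readers who want to see what a Type S/M calculation involves, here is a minimal sketch. It is not the ‘retrodesign’ package or its API. It assumes a normally distributed, unbiased estimate with known standard error and a two-sided test, and it uses a closed-form expression for the exaggeration ratio. The function name `type_s_m` and the example inputs are hypothetical.

```python
from statistics import NormalDist


def type_s_m(true_effect, se, alpha=0.05):
    """Return (power, Type S error rate, Type M exaggeration ratio).

    Assumes estimate ~ Normal(true_effect, se) with true_effect > 0.
    """
    if true_effect <= 0 or se <= 0:
        raise ValueError("true_effect and se must be positive")
    std = NormalDist()
    z = std.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = true_effect / se         # true effect in standard-error units
    p_hi = 1 - std.cdf(z - shift)    # significant with the correct sign
    p_lo = std.cdf(-z - shift)       # significant with the wrong sign
    power = p_hi + p_lo
    type_s = p_lo / power            # P(wrong sign | significant)
    # Expected |estimate| over the two significance regions, using the
    # truncated-normal mean formula in place of simulation.
    upper = true_effect * p_hi + se * std.pdf(z - shift)
    lower = -true_effect * p_lo + se * std.pdf(-z - shift)
    exaggeration = (upper + lower) / power / true_effect
    return power, type_s, exaggeration


if __name__ == "__main__":
    # Hypothetical inputs: a small true effect measured with a large
    # standard error.
    power, type_s, exaggeration = type_s_m(true_effect=2.0, se=8.1)
    print(f"power={power:.3f}, type S={type_s:.3f}, "
          f"exaggeration={exaggeration:.2f}")
```

With a small true effect relative to the standard error, power is low, a statistically significant estimate has a substantial chance of having the wrong sign, and its magnitude overstates the true effect severalfold.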