Gronau and Wagenmakers write: The bridgesampling package facilitates the computation of the marginal likelihood for a wide range of statistical models. For models implemented in Stan (in such a way that the constants are retained), executing the code bridge_sampler(stanfit) automatically produces an estimate of the marginal likelihood. Full story is at the link.
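The estimator behind the package is iterative bridge sampling (Meng and Wong, 1996). Here is a minimal Python sketch on a toy conjugate-normal model where the marginal likelihood is known in closed form; the package itself is R and operates on Stan fits, so the toy model, proposal, and all names below are illustrative assumptions, not the package's code:

```python
import numpy as np

# Toy model: theta ~ N(0, 1), y | theta ~ N(theta, 1), one observation.
# True marginal likelihood in closed form: p(y) = N(y; 0, 2).
rng = np.random.default_rng(1)
y = 1.0

def q(theta):
    # Unnormalized posterior = likelihood * prior (constants retained).
    return (np.exp(-0.5 * (y - theta) ** 2) / np.sqrt(2 * np.pi)
            * np.exp(-0.5 * theta ** 2) / np.sqrt(2 * np.pi))

# Exact posterior here is N(y/2, 1/2); in real use these would be MCMC draws.
n1 = n2 = 50_000
post = rng.normal(y / 2, np.sqrt(0.5), n1)

# Proposal g, roughly matched to the posterior.
g_mean, g_sd = y / 2, 1.0
prop = rng.normal(g_mean, g_sd, n2)

def g(theta):
    return np.exp(-0.5 * ((theta - g_mean) / g_sd) ** 2) / (g_sd * np.sqrt(2 * np.pi))

# Meng & Wong fixed-point iteration for Z = p(y) with the optimal bridge function.
s1, s2 = n1 / (n1 + n2), n2 / (n1 + n2)
z = 1.0  # initial guess
for _ in range(50):
    num = np.mean(q(prop) / (s1 * q(prop) + s2 * z * g(prop)))
    den = np.mean(g(post) / (s1 * q(post) + s2 * z * g(post)))
    z = num / den

true_z = np.exp(-0.25 * y ** 2) / np.sqrt(4 * np.pi)  # closed-form p(y) = N(y; 0, 2)
print(z, true_z)
```

With the proposal well matched to the posterior, the iteration converges in a handful of steps and the estimate agrees with the closed-form answer to well under a percent.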


## Psychological Methods Feed

Someone writes: I’m emailing you about an email service I provide for some of the best blogs and podcasts on psychological methods. People can sign up for free and receive daily or real-time emails containing the blog post(s). Below is the list. I had no idea there were so many blogs that discussed psychological methods. Oscar […]

## Many perspectives on Deborah Mayo’s “Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars”

This is not new—these reviews appeared in slightly rawer form several months ago on the blog. After that, I reorganized the material slightly and sent it to Harvard Data Science Review (motto: “A Microscopic, Telescopic, and Kaleidoscopic View of Data Science”) but unfortunately reached a reviewer who (a) didn’t like Mayo’s book, and (b) felt that […]

## Against Arianism 3: Consider the cognitive models of the field

“You took my sadness out of context at the Mariners Apartment Complex” – Lana Del Rey

It’s sunny, I’m in England, and I’m having a very tasty beer, and Lauren, Andrew, and I just finished a paper called The experiment is just as important as the likelihood in understanding the prior: A cautionary note on robust […]

## Several reviews of Deborah Mayo’s new book, Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars

A few months ago I sent the following message to some people: Dear philosophically-inclined colleagues: I’d like to organize an online discussion of Deborah Mayo’s new book. The table of contents and some of the book are here at Google books, also in the attached pdf and in this post by Mayo. I think that […]

## Published in 2018

R-squared for Bayesian regression models. *American Statistician*. (Andrew Gelman, Ben Goodrich, Jonah Gabry, and Aki Vehtari) Voter registration databases and MRP: Toward the use of large scale databases in public opinion research. *Political Analysis*. (Yair Ghitza and Andrew Gelman) Limitations of “Limitations of Bayesian leave-one-out cross-validation for model selection.” *Computational Brain and* […]

## Facial feedback: “These findings suggest that minute differences in the experimental protocol might lead to theoretically meaningful changes in the outcomes.”

Fritz Strack points us to this article, “When Both the Original Study and Its Failed Replication Are Correct: Feeling Observed Eliminates the Facial-Feedback Effect,” by Tom Noah, Yaacov Schul, and Ruth Mayo, who write: According to the facial-feedback hypothesis, the facial activity associated with particular emotional expressions can influence people’s affective experiences. Recently, a replication […]

## Limitations of “Limitations of Bayesian Leave-One-Out Cross-Validation for Model Selection”

“If you will believe in your heart and confess with your lips, surely you will be saved one day” – The Mountain Goats paraphrasing Romans 10:9

One of the weird things about working with people a lot is that it doesn’t always translate into multiple opportunities to see them talk. I’m pretty sure the only […]

## Comments on Limitations of Bayesian Leave-One-Out Cross-Validation for Model Selection

There is a recent pre-print Limitations of Bayesian Leave-One-Out Cross-Validation for Model Selection by Quentin Gronau and Eric-Jan Wagenmakers. Wagenmakers asked for comments and so here are my comments. Short version: They report a known limitation of LOO when it’s used in a non-recommended way for model selection. They report that their experiments show that […]

## The p-curve, p-uniform, and Hedges (1984) methods for meta-analysis under selection bias: An exchange with Blake McShane, Uri Simonsohn, and Marcel van Assen

Blake McShane sent me some material related to a paper of his (McShane et al., 2016; see reference list below), regarding various methods for combining p-values for meta-analysis under selection bias. His remarks related to some things written by Uri Simonsohn and his colleagues, so I cc-ed Uri on the correspondence. After some back and […]

## A pivotal episode in the unfolding of the replication crisis

Axel Cleeremans writes: I appreciated your piece titled “What has happened down here is the winds have changed”. Your mini-history of what happened was truly enlightening — but you didn’t explicitly mention our failure to replicate Bargh’s slow walking effect. This was absolutely instrumental in triggering the replication crisis. As you know, the article was […]

## Stan Roundup, 10 November 2017

We’re in the heart of the academic season and there’s a lot going on. James Ramsey reported a critical performance regression bug in Stan 2.17 (this affects the latest CmdStan and PyStan, not the latest RStan). Sean Talts and Daniel Lee traced the underlying problem to the change from char* to std::string arguments—you […]

## This Friday at noon, join this online colloquium on replication and reproducibility, featuring experts in economics, statistics, and psychology!

Justin Esarey writes: This Friday, October 27th at noon Eastern time, the International Methods Colloquium will host a roundtable discussion on the reproducibility crisis in social sciences and a recent proposal to impose a stricter threshold for statistical significance. The discussion is motivated by a paper, “Redefine statistical significance,” recently published in Nature Human Behaviour (and available […]

## “Bayesian evidence synthesis”

Donny Williams writes: My colleagues and I have a paper recently accepted in the journal Psychological Science in which we “bang” on Bayes factors. We explicitly show how the Bayes factor varies according to tau (I thought you might find this interesting for yourself and your blog’s readers). There is also a very nice figure. […]

## When considering proposals for redefining or abandoning statistical significance, remember that their effects on science will only be indirect!

John Schwenkler organized a discussion on this hot topic, featuring posts by:

- Dan Benjamin, Jim Berger, Magnus Johannesson, Valen Johnson, Brian Nosek, and E. J. Wagenmakers
- Felipe De Brigard
- Kenny Easwaran
- Andrew Gelman and Blake McShane
- Kiley Hamlin
- Edouard Machery
- Deborah Mayo
- “Neuroskeptic”
- Michael Strevens
- […]

## Response to some comments on “Abandon Statistical Significance”

The other day, Blake McShane, David Gal, Christian Robert, Jennifer Tackett, and I wrote a paper, Abandon Statistical Significance, that began: In science publishing and many areas of research, the status quo is a lexicographic decision rule in which any result is first required to have a p-value that surpasses the 0.05 threshold and only […]

## “Bayes factor”: where the term came from, and some references to why I generally hate it

Someone asked: Do you know when this term was coined or by whom? Kass and Raftery’s use of the term as the title of their 1995 paper suggests that it was still novel then, but I have not noticed any information in the paper about where it started. I replied: According to Etz and Wagenmakers […]

## Abraham Lincoln and confidence intervals

Our recent discussion with mathematician Russ Lyons on confidence intervals reminded me of a famous logic paradox, in which equality is not as simple as it seems. The classic example goes as follows: Abraham Lincoln is the 16th president of the United States, but this does not mean that one can substitute the two expressions […]

## Yes, despite what you may have heard, you can easily fit hierarchical mixture models in Stan

There was some confusion on the Stan list that I wanted to clear up, having to do with fitting mixture models. Someone quoted this from John Kruschke’s book, Doing Bayesian Data Analysis: The lack of discrete parameters in Stan means that we cannot do model comparison as a hierarchical model with an indexical parameter at […]
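The workaround the post alludes to is marginalization: instead of sampling a discrete component label z, sum it out of the likelihood, which is what Stan’s log_sum_exp and log_mix functions support. A minimal numpy sketch of that log-sum-exp marginalization for a two-component normal mixture (the component weights, means, and scales below are made up for illustration, not taken from the post):

```python
import numpy as np

def norm_logpdf(y, mu, sigma):
    # Log density of N(y | mu, sigma).
    return -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

def mixture_loglik(y, weights, mus, sigmas):
    # log p(y) = log sum_k w_k * N(y | mu_k, sigma_k), with the discrete
    # label summed out; computed via log-sum-exp for numerical stability.
    lp = np.log(weights) + np.array([norm_logpdf(y, m, s) for m, s in zip(mus, sigmas)])
    m = np.max(lp)
    return m + np.log(np.sum(np.exp(lp - m)))

ll = mixture_loglik(0.5, weights=[0.3, 0.7], mus=[0.0, 2.0], sigmas=[1.0, 1.0])
print(ll)
```

Because the marginalized likelihood is a smooth function of the continuous parameters, HMC-based samplers like Stan can fit the mixture directly—no discrete parameter required.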

## “Marginally Significant Effects as Evidence for Hypotheses: Changing Attitudes Over Four Decades”

Kevin Lewis sends along this article by Laura Pritschet, Derek Powell, and Zachary Horne, who write: Some effects are statistically significant. Other effects do not reach the threshold of statistical significance and are sometimes described as “marginally significant” or as “approaching significance.” Although the concept of marginal significance is widely deployed in academic psychology, there […]