Bayesian Linear Mixed Models using Stan: A tutorial for psychologists, linguists, and cognitive scientists

Posted on July 19, 2016 9:26 AM by Andrew

This article by Tanner Sorensen, Sven Hohenstein, and Shravan Vasishth might be of interest to some of you.

33 thoughts on “Bayesian Linear Mixed Models using Stan: A tutorial for psychologists, linguists, and cognitive scientists”

leoboiko on July 19, 2016 10:25 AM at 10:25 am said:

Oh it is! Thank you!

Ben on July 19, 2016 1:06 PM at 1:06 pm said:

Adding my thanks, too.

Shravan on July 19, 2016 5:39 PM at 5:39 pm said:

This paper has an interesting history, for those suffering repeated rejections.

1. We first submitted it to PLoS ONE, rejected without review. Reason: don’t do tutorials.
2. Then we submitted it to Journal of Math Psych. Rejected after review. Major comment from a reviewer: why do we need a tutorial? Just read the (800 page) Stan manual. Fertig.
3. Then we rewrote it and submitted it to a special issue in Psych Methods. Rejected without review.
4. Then we submitted it to a special issue in Zeitschrift fuer Psychologie or some such journal I have never heard of. Apparently the oldest journal in psych. There was a special issue. Long silence, then we were informed that the special issue was cancelled because nobody submitted except us (maybe one or two others? I don’t remember). We withdrew it and I don’t think we got any reviews.
5. Then we submitted it to an open access journal Quantitative Methods for Psychology. This was finally reviewed (and the reviews were very helpful and improved the paper), and eventually accepted.

I may have missed a journal or two in this list that we submitted to. All this took some 1.5 years and three or so complete rewrites. The rewrites may have improved it a bit, but I don’t feel that what happened to us was fair or justified. I had heard that at a Bayesian summer school people were using this tutorial and recommending it to each other. I still don’t understand why this paper was so hard to publish, especially since it actually teaches a very useful skill. But it’s in press now. So it’s official folks. It’s now worth reading. It has passed peer review.

- Bob Carpenter on July 19, 2016 5:54 PM at 5:54 pm said:
  
  I’d suggest writing a book. People read those and they’re relatively easy to get published.
  
  P.S. I removed the duplicate comment.
  
  - Shravan on July 20, 2016 1:03 AM at 1:03 am said:
    
    I just entered a two year sabbatical from teaching, I may well do something like that. But it’s going to be hard to top McElreath’s or Kruschke’s books if it’s for non-statisticians. There are a lot of people who only need to know one thing, how to fit linear mixed models; they don’t want to wade through a book. For them this short tutorial article might be useful.
    
    PS Somehow I have a hard time getting my students in my statistics classes to even read my lecture notes, let alone books. I have to constantly say things like, “Please remember that you are allowed to read up on this, you don’t just need to rely on my lectures and your notes taken in class.”
    
    - Rahul on July 20, 2016 3:09 AM at 3:09 am said:
      
      A short book maybe? :)
      
      Most books can say what’s really their point in 20% of the pages they use. The rest is just redundant information, explained better elsewhere. Maybe there’s pressure from the publisher to add pages so that buyers won’t feel ripped off?
      
      I think there’s an untapped market for short books, 50-100 pages, that would sell at roughly a $15-$20 price point.
    - Shravan on July 20, 2016 3:36 AM at 3:36 am said:
      
      Yeah, I know what you mean. The BDA3 book is impossible to carry physically on a vacation with you, and the Kindle edition is unreadable because each page renders ultra-slowly on the computer screen, and it can’t even be read on a Kindle (surprisingly, seeing as it’s a Kindle book!). So I ended up not reading the whole of the BDA3 book. The Kruschke book comes as a pdf, but it just crashed my pdf viewer, it was 700+ pages and there was some bug in the book or the viewer. As a result I didn’t read the whole book.
      
      My ideal is to write using something like Rbookdown, make the book free but also print editions if one wants them. No publisher seems to want to do that (except O’Reilly with Hadley Wickham’s books). I have a book contract with Cambridge Uni Press (different topic) and they refused to let me put up the book for free the way they allowed it for Mackay’s Information Theory book. Maybe you have to be famous to be allowed to put up your book for free while also having print editions. That means I am doomed to either put up free pdfs and get no publication credit for my books, or to write books that cost actual money.
      
      I may do something like that. What’s a good title? Some relevant clickbait titles could be:
      
      – The Signal and the Noise Revisited: How to publish noise as signal in top journals, Tips and tricks from the professional statistician’s perspective
      – P-hacking for Professionals: How to analyze your data so that nobody knows what actually happened
      – How to Fix Null Results: What to do when things go south and you want to go north
      – How to Lie with Statistics Without Actually Lying: Tips and tricks on using conversational implicature and clever wording to hide the dirty reality of your data
      
      by Shravan Vasishth, PhD
    - Rahul on July 20, 2016 3:46 AM at 3:46 am said:
      
      Go with a Publisher & “leak” out pdf copies. So long as you retain plausible deniability. :)
    - Shravan on July 20, 2016 3:50 AM at 3:50 am said:
      
      Good idea. I can blame that scientist from Eastern Europe who created sci-hub.
    - Bob Carpenter on July 20, 2016 8:43 PM at 8:43 pm said:
      
      We are releasing free pdfs on the up and up with the Stan books we are working on with Chapman and Hall. Cambridge University Press also allows free pdfs in some cases.
    - Shravan on July 21, 2016 2:57 AM at 2:57 am said:
      
      Bob, I saw on your blog that you release reviewer comments publicly. Is this a violation of confidentiality? I’m asking because I have been also thinking of discussing the things reviewers say publicly, but I have always been unsure whether I am violating an implicit agreement never to talk publicly about the paper and the review. Of course, I don’t need to care, I am a full professor with tenure; what can anyone do to me? But it’s more a question of ethics; if I have agreed to something implicitly, can I violate that agreement? I was curious about what others think about releasing reviews publicly. One useful purpose they would serve is to show the younger generation some examples of what happens in others’ papers, to create some more general awareness of the norms. Right now students learn this from their advisors’ hard earned experience.
    - Oliver on July 21, 2016 4:09 AM at 4:09 am said:
      
      If you are interested in what others think about open reviews, you might find this survey interesting: https://rolfzwaan.blogspot.de/2016/04/the-open-review-survey-first-peek.html
    - elin on July 20, 2016 8:44 PM at 8:44 pm said:
      
      I agree with this. I used to use a lot of the Sage “Little Green Books” for intermediate courses. https://us.sagepub.com/en-us/nam/QASS.
- elin on July 19, 2016 8:06 PM at 8:06 pm said:
  
  Wow Shravan, that is really fantastic.
  
  - Shravan on July 20, 2016 12:58 AM at 12:58 am said:
    
    Thanks Elin!
    
    PS Also, thanks to Andrew for linking to this paper.
    
- Martha (Smith) on July 20, 2016 12:25 AM at 12:25 am said:
  
  Unfortunately, the system is indeed neither fair or justified, in lost of ways. The “rules” have become codified as “That’s The Way We’ve Always Done It.” That’s not a good justification.
  
  - Martha (Smith) on July 20, 2016 12:28 AM at 12:28 am said:
    
    Oops. Should be “… neither fair nor justified” and “in lots of ways” (although “lost” does indeed describe some of the ways things are currently done.)
    
  - Shravan on July 20, 2016 1:08 AM at 1:08 am said:
    
    Peer review does usually improve the paper, I have to admit that. It’s just too costly for students as things stand. In my university (Potsdam) you can do a paper based dissertation (called a cumulative diss). This means having three papers published or accepted (with a subset in review) by the time you defend. Currently this is very difficult to achieve in 3 years, going from nothing to doing the research and publishing, if waiting times are 1.5 years (which is starting to feel standard to me now; I think I have six papers stuck like this right now).
    
- Keith O'Rourke on July 20, 2016 7:55 AM at 7:55 am said:
  
  Shravan:
  
  Some editors and reviewers take into consideration how an accepted publication in their journal might credential the authors more than they might _deserve_.
  
  This seemed to be the case in one of my past reviews “Although your paper has some fascinating points concerning, in particular, likelihood visualizations as part of an appropriate data analysis and modeling/inferential strategy, it does not provide a sufficient computational nor graphical contribution/advance to justify publication in ****.” The next journal submitted to had a similar concern about the lack of technical innovation. Both were correct in their assessment – what I was suggesting did not require new technical developments.
  
  “So [especially] hard to publish, especially since it [only] actually teaches a very useful skill” described what I was trying to publish to a tee. Maybe sometime, when I have [a lot] more time, I might revisit the material, but likely following Bob’s suggestion, possibly using Bookdown (by RStudio).
  
- Rahul on July 20, 2016 8:25 AM at 8:25 am said:
  
  It would be a fun project to try to train a classifier that can identify the journal based on the text of a paper.
  
  I wonder how well we could do.
  
  - Shravan on July 20, 2016 9:49 AM at 9:49 am said:
    
    Hey, this is a cool idea. I will assign this as a project to the next computationally inclined student coming in for a master’s.
    
    - Rahul on July 20, 2016 12:31 PM at 12:31 pm said:
      
      Allowing training to use the order & titles of Section Headings & the reference list formats should already help a lot. Add in paper length & keyword list and you ought to make a lot of progress.
      
      I wonder what other features one might extract to train on.
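
A minimal sketch of how the features suggested above (section headings and their order, keywords, paper length) might feed a classifier. Everything here is an assumption made for illustration: the `paper` dictionary layout, the journal labels "Journal A" and "Journal B", the toy training papers, and the choice of a naive Bayes model with Laplace smoothing. None of it comes from the thread, and a real attempt would need actual papers and proper evaluation.

```python
import math
from collections import Counter, defaultdict

def extract_features(paper):
    """Turn a paper (hypothetical dict layout) into a bag of string features."""
    headings = [h.strip().lower() for h in paper["headings"]]
    feats = [f"heading={h}" for h in headings]
    # Order of consecutive section headings, encoded as bigrams.
    feats += [f"order={a}>{b}" for a, b in zip(headings, headings[1:])]
    feats += [f"keyword={k.strip().lower()}" for k in paper["keywords"]]
    # Paper length in coarse buckets; the 2000-word width is arbitrary.
    feats.append(f"length_bucket={paper['n_words'] // 2000}")
    return feats

class NaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing over string features."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, papers, journals):
        for paper, journal in zip(papers, journals):
            feats = extract_features(paper)
            self.class_counts[journal] += 1
            self.feat_counts[journal].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, paper):
        n_docs = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for journal, count in self.class_counts.items():
            total = sum(self.feat_counts[journal].values())
            score = math.log(count / n_docs)
            for f in extract_features(paper):
                if f not in self.vocab:
                    continue  # ignore features never seen in training
                num = self.feat_counts[journal][f] + self.alpha
                den = total + self.alpha * len(self.vocab)
                score += math.log(num / den)
            if score > best_score:
                best, best_score = journal, score
        return best

# Toy, invented training data: two made-up journals with different habits.
train = [
    ({"headings": ["Introduction", "Method", "Results", "Discussion"],
      "keywords": ["reading", "eyetracking"], "n_words": 6000}, "Journal A"),
    ({"headings": ["Introduction", "Method", "Results", "General Discussion"],
      "keywords": ["reading"], "n_words": 7000}, "Journal A"),
    ({"headings": ["Introduction", "Model", "Simulation", "Conclusion"],
      "keywords": ["bayesian", "mcmc"], "n_words": 12000}, "Journal B"),
    ({"headings": ["Introduction", "Background", "Model", "Conclusion"],
      "keywords": ["bayesian"], "n_words": 11000}, "Journal B"),
]
model = NaiveBayes().fit([p for p, _ in train], [j for _, j in train])

new_paper = {"headings": ["Introduction", "Method", "Results", "Discussion"],
             "keywords": ["eyetracking"], "n_words": 6500}
predicted = model.predict(new_paper)
print(predicted)
```

Encoding heading order as bigrams keeps the model a plain bag of features, so other signals such as reference-list format could be added as further string features without changing the classifier.
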
    - Shravan on July 21, 2016 1:58 AM at 1:58 am said:
      
      Another cool application would be to find the appropriate reviewers for a paper.
- Shravan on October 22, 2016 1:45 AM at 1:45 am said:
  
  This paper has finally been published. See here.
  
- Conrad Zygmont on February 8, 2017 3:13 PM at 3:13 pm said:
  
  I was really interested in your post and am busy developing my understanding of Bayesian statistics and so will definitely read your paper in the morning.
  
  I had virtually exactly the same experience as you did with a paper on EFA. I also went through a long process of bouncing from journal to journal (I also submitted to PM, and I think JMP, and also Psychometrika and one or two other journals) and received good feedback from some of the journals, but no one was interested in publishing it. I really felt like the paper would be of value to someone, so I didn’t give up, and then got a very positive response from Quantitative Methods for Psychology. The cherry on top was when, shortly after publication, I received an email from a reader who had benefited from the paper, and another from someone who wanted help with their analysis. Being one of many academics who are largely self-taught when it comes to advanced statistics, I really see the value in these kinds of tutorials and wish more academic journals were willing to publish them.
  
  Thanks for the great work, and thanks Andrew for always posting such interesting content!
  
David J. Harris on July 20, 2016 1:33 AM at 1:33 am said:

It took me a while to find the paper, since there isn’t a direct link to it. For anyone that’s interested, the direct link is https://github.com/vasishth/BayesLMMTutorial/raw/master/doc/SorensenEtAl.pdf

- Shravan on July 20, 2016 3:14 AM at 3:14 am said:
  
  Oh, sorry. I will fix this right away and provide a direct link to the pdf.
  
  - Shravan on July 20, 2016 3:21 AM at 3:21 am said:
    
    Fixed this. Sorry; I keep forgetting github/bitbucket is not standard for everyone! :)
    
Mike on July 20, 2016 1:29 PM at 1:29 pm said:

Can’t say I’m a fan of all the implicit uniform priors on the variance parameters. Seems dangerous to teach folks these models without talking about the importance of using at least somewhat informative priors.

- Andrew on July 20, 2016 1:41 PM at 1:41 pm said:
  
  Mike:
  
  I agree, and I take much of the blame for this, as we mostly use uniform priors on hyperparameters in our books. Since writing those books, I’ve changed my views and have become much more convinced of the value of informative priors.
  
- Shravan on July 21, 2016 3:13 AM at 3:13 am said:
  
  Good point. We discuss this in footnote 3:
  
  “This is an example of an improper prior, which is not a probability distribution. Although all the improper priors used in this tutorial produce posteriors which are probability distributions, this is not true in general, and care should be taken in using improper priors (Gelman, 2006). In the present case, a Cauchy prior truncated to have a lower bound of 0 could alternatively be defined for the standard deviation. For example code using such a prior, see the KBStan vignette in the RePsychLing package (Baayen, Bates, Kliegl, & Vasishth, 2015).”
  
  In practice, we always do a sensitivity analysis with different priors, and for linguists and the like we have a more entry level discussion in this review, where we also discuss it in more detail:
  http://www.ling.uni-potsdam.de/~vasishth/pdfs/StatMethLingPart2ArXiv.pdf
  
  In the kind of data we deal with (eyetracking, reading data), in practice the choice of the prior on the variance parameters doesn’t make any difference. But this is of course not going to be true in general.
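
To make the contrast in footnote 3 concrete, here is a minimal Python sketch comparing a flat (improper) prior on a standard deviation with a Cauchy prior truncated at 0. The scale of 1.0 and all function names are assumptions chosen for illustration; they are not taken from the tutorial or from the RePsychLing vignette.

```python
import math

def half_cauchy_pdf(sigma, scale=1.0):
    """Density of a Cauchy(0, scale) prior truncated to sigma >= 0."""
    if sigma < 0:
        return 0.0
    return 2.0 / (math.pi * scale * (1.0 + (sigma / scale) ** 2))

def half_cauchy_mass(upper, scale=1.0):
    """Prior mass on (0, upper), from the closed-form half-Cauchy CDF."""
    return (2.0 / math.pi) * math.atan(upper / scale)

def flat_mass(upper):
    """Unnormalized mass of the flat prior p(sigma) = 1 on (0, upper)."""
    return float(upper)

# Widen the interval and compare how much mass each prior assigns to it.
for upper in (1.0, 10.0, 100.0, 1000.0):
    print(f"upper={upper:>7}: half-Cauchy mass={half_cauchy_mass(upper):.4f}, "
          f"flat mass={flat_mass(upper):.1f}")
```

As the upper bound grows, the half-Cauchy mass approaches 1, so it is a proper probability distribution. The flat prior's mass grows without bound, so it cannot be normalized, which is what "improper" means in the footnote.
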
  
  - Shravan on July 21, 2016 3:14 AM at 3:14 am said:
    
    In the paper ee also point the reader to Doug Bates et al’s package RePsychLing, where we use non-uniform priors (if I remember correctly):
    
    https://github.com/dmbates/RePsychLing
    
    Look at the KBStan.Rmd vignette.
    
    - Shravan on July 21, 2016 3:16 AM at 3:16 am said:
      
      ee -> we

Statistical Modeling, Causal Inference, and Social Science
