I (Bob, not Andrew!) will be doing a meetup talk this coming Thursday in New York City. Here’s the link with registration and location and time details (summary: pizza unboxing at 6:30 pm in SoHo): Bayesian Data Analysis Meetup: Under the hood: Stan’s library, language, and algorithms After summarizing what Stan does, this talk will […]

**Statistical computing** category.

## Reproducibility and Stan

Aki prepared these slides which cover a series of topics, starting with notebooks, open code, and reproducibility of code in R and Stan; then simulation-based calibration of algorithms; then model averaging and prediction. Lots to think about here: there are many aspects to reproducible analysis and computation in statistics.

## “How should they carry out repeated cross-validation? They would like a third expert opinion…”

Someone writes: I’m a postdoc studying scientific reproducibility. I have a machine learning question that I desperately need your help with. . . . I’m trying to predict whether a study can be successfully replicated (DV), from the texts in the original published article. Our hypothesis is that language contains useful signals in distinguishing reproducible […]

## “Do you have any recommendations for useful priors when datasets are small?”

A statistician who works in the pharmaceutical industry writes: I just read your paper (with Dan Simpson and Mike Betancourt) “The Prior Can Often Only Be Understood in the Context of the Likelihood” and I find it refreshing to read that “the practical utility of a prior distribution within a given analysis then depends critically […]

## Hey, check this out: Columbia’s Data Science Institute is hiring research scientists and postdocs!

Here’s the official announcement: The Institute’s Postdoctoral and Research Scientists will help anchor Columbia’s presence as a leader in data-science research and applications and serve as resident experts in fostering collaborations with the world-class faculty across all schools at Columbia University. They will also help guide, plan and execute data-science research, applications and technological innovations […]

## Postdocs and Research fellows for combining probabilistic programming, simulators and interactive AI

Here’s a great opportunity for those interested in probabilistic programming and workflows for Bayesian data analysis: We (including me, Aki) are looking for outstanding postdoctoral researchers and research fellows to work for a new exciting project in the crossroads of probabilistic programming, simulator-based inference and user interfaces. You will have an opportunity to work with […]

## “Simulations are not scalable but theory is scalable”

Eren Metin Elçi writes: I just watched this video on the value of theory in applied fields (like statistics), it really resonated with my previous research experiences in statistical physics and on the interplay between randomised perfect sampling algorithms and Markov Chain mixing as well as my current perspective on the status quo of deep learning. […]

## Stan development in RStudio

Check this out! RStudio now has special features for Stan: – Improved, context-aware autocompletion for Stan files and chunks – A document outline, which allows for easy navigation between Stan code blocks – Inline diagnostics, which help to find issues while you develop your Stan model – The ability to interrupt Stan parallel workers launched […]

## Limitations of “Limitations of Bayesian Leave-One-Out Cross-Validation for Model Selection”

“If you will believe in your heart and confess with your lips, surely you will be saved one day” – The Mountain Goats paraphrasing Romans 10:9 One of the weird things about working with people a lot is that it doesn’t always translate into multiple opportunities to see them talk. I’m pretty sure the only […]

## Why are functional programming languages so popular in the programming languages community?

Matthijs Vákár writes: Re the popularity of functional programming and Church-style languages in the programming languages community: there is a strong sentiment in that community that functional programming provides important high-level primitives that make it easy to write correct programs. This is because functional code tends to be very short and easy to reason about […]

## Using Stacking to Average Bayesian Predictive Distributions (with Discussion)

I’ve posted on this paper (by Yuling Yao, Aki Vehtari, Daniel Simpson, and myself) before, but now the final version has been published, along with a bunch of interesting discussions and our rejoinder. This has been an important project for me, as it answers a question that’s been bugging me for over 20 years (since […]

## “Dynamically Rescaled Hamiltonian Monte Carlo for Bayesian Hierarchical Models”

Aki points us to this paper by Tore Selland Kleppe, which begins: Dynamically rescaled Hamiltonian Monte Carlo (DRHMC) is introduced as a computationally fast and easily implemented method for performing full Bayesian analysis in hierarchical statistical models. The method relies on introducing a modified parameterisation so that the re-parameterised target distribution has close to constant […]

## A.I. parity with the West in 2020

Someone just sent me a link to an editorial by Ken Church, in the journal Natural Language Engineering (who knew that journal was still going? I’d have thought open access would’ve killed it). The abstract of Church’s column says of China, There is a bold government plan for AI with specific milestones for parity with […]

## StanCon Helsinki streaming live now (and tomorrow)

We’re streaming live right now! Thursday 08:45-17:30: YouTube Link Friday 09:00-17:00: YouTube Link Timezone is Eastern European Summer Time (EEST) +0300 UTC Here’s a link to the full program [link fixed]. There have already been some great talks and they’ll all be posted with slides and runnable source code after the conference on the Stan […]

## Three informal case studies: (1) Monte Carlo EM, (2) a new approach to C++ matrix autodiff with closures, (3) C++ serialization via parameter packs

Andrew suggested I cross-post these from the Stan forums to his blog, so here goes. Maximum marginal likelihood and posterior approximations with Monte Carlo expectation maximization: I unpack the goal of max marginal likelihood and approximate Bayes with MMAP and Laplace approximations. I then go through the basic EM algorithm (with a traditional analytic example […]

## Continuous tempering through path sampling

Yuling prepared this poster summarizing our recent work on path sampling using a continuous joint distribution. The method is really cool and represents a real advance over what Xiao-Li and I were doing in our 1998 paper. It’s still gonna have problems in high or even moderate dimensions, and ultimately I think we’re gonna need […]

## Thanks, NVIDIA

Andrew and I both received a note like this from NVIDIA: We have reviewed your NVIDIA GPU Grant Request and are happy to support your work with the donation of (1) Titan Xp to support your research. Thanks! In case other people are interested, NVIDIA’s GPU grant program provides ways for faculty or research scientists to […]

## Awesome MCMC animation site by Chi Feng! On Github!

Sean Talts and Bob Carpenter pointed us to this awesome MCMC animation site by Chi Feng. For instance, here’s NUTS on a banana-shaped density. This is indeed super-cool, and maybe there’s a way to connect these with Stan/ShinyStan/Bayesplot so as to automatically make movies of Stan model fits. This would be great, both to help […]

## Where do I learn about log_sum_exp, log1p, lccdf, and other numerical analysis tricks?

Richard McElreath inquires: I was helping a colleague recently fix his MATLAB code by using log_sum_exp and log1m tricks. The natural question he had was, “where do you learn this stuff?” I checked Numerical Recipes, but the statistical parts are actually pretty thin (at least in my 1994 edition). Do you know of any books/papers […]
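For readers who haven't seen these tricks, here is a minimal sketch in Python of the two named in the question. The function names `log_sum_exp` and `log1m` are borrowed from the post; the implementations below are an illustration using only the standard library and are not taken from Stan or from the colleague's MATLAB code.

```python
import math

def log_sum_exp(xs):
    """Compute log(sum(exp(x) for x in xs)) without overflow or underflow.

    Subtracting the maximum before exponentiating keeps every exp()
    argument at or below zero, so the largest term is exactly exp(0) = 1.
    Assumes xs is a non-empty sequence of floats.
    """
    m = max(xs)
    if math.isinf(m):
        # All terms are -inf (sum is 0), or some term is +inf.
        return m
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log1m(x):
    """Compute log(1 - x) accurately when x is close to zero."""
    return math.log1p(-x)
```

The naive versions fail in easy-to-hit cases: `math.exp(-1000.0)` underflows to zero, so taking the log of a sum of such terms raises an error, and `1 - 1e-20` rounds to exactly `1.0`, so `math.log(1 - 1e-20)` returns zero and loses the answer entirely.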

## Divisibility in statistics: Where is it needed?

The basic formula of Bayesian inference is: p(parameters|data) is proportional to p(parameters)*p(data|parameters). And, for predictions, p(predictions|data) = integral_parameters p(predictions|parameters,data)*p(parameters|data). In these expressions (and the corresponding simpler versions for maximum likelihood), “parameters” and “data” are unitary objects. Yes, it can be helpful to think of the parameter objects as being a list or vector of individual parameters; and […]
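In conventional notation, the two expressions above can be written as follows. The symbols are an assumption for illustration: $\theta$ stands for the parameters, $y$ for the data, and $\tilde{y}$ for the predictions.

```latex
\begin{align}
p(\theta \mid y) &\propto p(\theta)\, p(y \mid \theta), \\
p(\tilde{y} \mid y) &= \int p(\tilde{y} \mid \theta, y)\, p(\theta \mid y)\, d\theta .
\end{align}
```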