Archive of entries posted by

Royal Society spam & more

Just a rant about spam (and more spam) from pay-to-publish and closed-access journals. Nothing much to see here. The latest offender is from something called the “Royal Society.” I don’t even know to which king or queen this particular society owes allegiance, because they have a .org URL. Exercising their royal prerogative, they created an […]

Beautiful paper on HMMs and derivatives

I’ve been talking to Michael Betancourt and Charles Margossian about implementing analytic derivatives for HMMs in Stan to reduce memory overhead and increase speed. For now, one has to implement the forward algorithm in the Stan program and let Stan autodiff through it. I worked out the adjoint method (aka reverse-mode autodiff) derivatives of the […]
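For readers who haven't seen it, here is a minimal sketch of the forward algorithm mentioned above, written in plain Python rather than Stan. It marginalizes out the latent states on the log scale. The function names, argument layout, and toy numbers are all assumptions made for illustration; none of them come from the post or from Stan's API. Coding these loops directly in a Stan program means reverse-mode autodiff has to record every intermediate value, which is roughly where the memory overhead comes from.

```python
import math

def log_sum_exp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))

def hmm_log_marginal(log_init, log_trans, log_emit):
    """Return log p(y_1, ..., y_T) with the hidden states summed out.

    Hypothetical layout (an assumption for this sketch):
      log_init[k]     = log p(z_1 = k)
      log_trans[j][k] = log p(z_t = k | z_{t-1} = j)
      log_emit[t][k]  = log p(y_t | z_t = k)
    """
    K = len(log_init)
    # alpha[k] = log p(y_1..y_t, z_t = k), initialized at t = 1
    alpha = [log_init[k] + log_emit[0][k] for k in range(K)]
    for t in range(1, len(log_emit)):
        alpha = [
            log_sum_exp([alpha[j] + log_trans[j][k] for j in range(K)])
            + log_emit[t][k]
            for k in range(K)
        ]
    return log_sum_exp(alpha)

# Toy two-state, two-observation example with made-up probabilities.
log_init = [math.log(0.6), math.log(0.4)]
log_trans = [[math.log(0.7), math.log(0.3)],
             [math.log(0.2), math.log(0.8)]]
log_emit = [[math.log(0.9), math.log(0.1)],
            [math.log(0.2), math.log(0.6)]]
log_marginal = hmm_log_marginal(log_init, log_trans, log_emit)
```

The cost is O(T K^2) for T observations and K states, compared with K^T terms for a naive sum over all state paths.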

Macbook Pro (16″ 2019) quick review

I just upgraded yesterday to one of the new 2019 Macbook Pro 16″ models: Macbook Pro (16″, 2019), 3072 x 1920 pixel display, 2.4 GHz 8-core i9, 64GB 2667 MHz DDR4 memory, AMD Radeon Pro 5500M GPU with 4GB of GDDR6 memory, 1 TB solid-state drive, US$4120 list including Apple […]

Field goal kicking—like putting in 3D with oblong balls

Putting Andrew Gelman (the author of most posts on this blog, but not this one) recently published a Stan case study on golf putting [link fixed] that uses a bit of geometry to build a regression-type model based on angles and force. Field-goal kicking In American football, there’s also a play called a “field goal.” […]

Econometrics postdoc and computational statistics postdoc openings here in the Stan group at Columbia

Andrew and I are looking to hire two postdocs to join the Stan group at Columbia starting January 2020. I want to emphasize that these are postdoc positions, not programmer positions. So while each position has a practical focus, our broader goal is to carry out high-impact, practical research that pushes the frontier of what’s […]

Non-randomly missing data is hard, or why weights won’t solve your survey problems and you need to think generatively

Throw this onto the big pile of stats problems that are a lot more subtle than they seem at first glance. This all started when Lauren pointed me at the post Another way to see why mixed models in survey data are hard on Thomas Lumley’s blog. Part of the problem is all the jargon […]

All the names for hierarchical and multilevel modeling

The title Data Analysis Using Regression and Multilevel/Hierarchical Models hints at the problem, which is that there are a lot of names for models with hierarchical structure. Ways of saying “hierarchical model” hierarchical model a multilevel model with a single nested hierarchy (note my nod to Quine’s “Two Dogmas” with circular references) multilevel model a […]

Calibration and sharpness?

I really liked this paper, and am curious what other people think before I base a grant application around applying Stan to this problem in a machine-learning context. Gneiting, T., Balabdaoui, F., & Raftery, A. E. (2007). Probabilistic forecasts, calibration and sharpness. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(2), 243–268. Gneiting […]

Seeking postdoc (or contractor) for next generation Stan language research and development

The Stan group at Columbia is looking to hire a postdoc* to work on the next generation compiler for the Stan open-source probabilistic programming language. Ideally, a candidate will bring language development experience and also have research interests in a related field such as programming languages, applied statistics, numerical analysis, or statistical computation. The language […]

Why does my academic lab keep growing?

Andrew, Breck, and I are struggling with the Stan group funding at Columbia just like most small groups in academia. The short story is that to apply for enough grants to give us a decent chance of making payroll in the following year, we have to apply for so many that our expected amount of […]

Software release strategies

Scheduled release strategy Stan’s moved to a scheduled release strategy where we’ll simply release whatever we have every three months. The Stan 2.20 release just went out last week. So you can expect Stan 2.21 in three months. Our core releases include the math library, the language compiler, and CmdStan. That requires us to keep […]

AnnoNLP conference on data coding for natural language processing

This workshop should be really interesting: Aggregating and analysing crowdsourced annotations for NLP EMNLP Workshop. November 3–4, 2019. Hong Kong. Silviu Paun and Dirk Hovy are co-organizing it. They’re very organized and know this area as well as anyone. I’m on the program committee, but won’t be able to attend. I really like the problem […]

Peter Ellis on Forecasting Antipodal Elections with Stan

I liked this intro to Peter Ellis from Rob J. Hyndman’s talk announcement: He [Peter Ellis] started forecasting elections in New Zealand as a way to learn how to use Stan, and the hobby has stuck with him since he moved back to Australia in late 2018. You may remember Peter from my previous post […]

Stan examples in Harezlak, Ruppert and Wand (2018) Semiparametric Regression with R

I saw earlier drafts of this when it was in preparation and they were great. Jarek Harezlak, David Ruppert and Matt P. Wand. 2018. Semiparametric Regression with R. UseR! Series. Springer. I particularly like the careful evaluation of variational approaches. I also very much like that it’s packed with visualizations and largely based on worked […]

StanCon 2019: 20–23 August, Cambridge, UK

It’s official. This year’s StanCon is in Cambridge. For details, see StanCon 2019 Home Page What can you expect? There will be two days of tutorials at all levels and two days of invited and submitted talks. The previous three StanCons (NYC 2017, Asilomar 2018, Helsinki 2018) were wonderful experiences for both their content and […]

Ben Lambert. 2018. A Student’s Guide to Bayesian Statistics.

Ben Goodrich, in a Stan forums survey of Stan video lectures, points us to the following book, which introduces Bayes, HMC, and Stan: Ben Lambert. 2018. A Student’s Guide to Bayesian Statistics. SAGE Publications. If Ben Goodrich is recommending it, it’s bound to be good. Amazon reviewers seem to really like it, too. You may […]

(Markov chain) Monte Carlo doesn’t “explore the posterior”

[Edit: (1) There’s nothing dependent on Markov chain—the argument applies to any Monte Carlo method in high dimensions. (2) No, (MC)MC is not broken.] First some background, then the bad news, and finally the good news. Spoiler alert: The bad news is that exploring the posterior is intractable; the good news is that we […]

Book reading at Ann Arbor Meetup on Monday night: Probability and Statistics: a simulation-based introduction

The Talk I’m going to be previewing the book I’m in the process of writing at the Ann Arbor R meetup on Monday. Here are the details, including the working title: Probability and Statistics: a simulation-based introduction Bob Carpenter Monday, February 18, 2019 Ann Arbor SPARK, 330 East Liberty St, Ann Arbor I’ve been to […]

Google on Responsible AI Practices

Great and beautifully written advice for any data science setting: Google. Responsible AI Practices. Enjoy.

NYC Meetup Thursday: Under the hood: Stan’s library, language, and algorithms

I (Bob, not Andrew!) will be doing a meetup talk this coming Thursday in New York City. Here’s the link with registration and location and time details (summary: pizza unboxing at 6:30 pm in SoHo): Bayesian Data Analysis Meetup: Under the hood: Stan’s library, language, and algorithms After summarizing what Stan does, this talk will […]