The European Commission just released their Proposal for a Regulation on a European approach for Artificial Intelligence. They finally get around to a definition of “AI” on page 60 of the report (link above): ‘artificial intelligence system’ (AI system) means software that is developed with one or more of the techniques and approaches listed in […]

## The Folk Theorem, revisited

It’s time to review the folk theorem, an old saw on this blog, on the Stan forums, and in all of Andrew’s and my applied modeling. Folk Theorem Andrew uses “folk” in the sense of being folksy as opposed to rigorous. The Folk Theorem of Statistical Computing (Gelman 2008): When you have computational problems, often […]

## Summer research jobs at Flatiron Institute

If you’re an undergrad or grad student and work in applied math, stats, or machine learning, you may be interested in our summer research assistant and associate positions at the Flatiron Institute’s Center for Computational Mathematics: Scientific computing summer positions Machine learning and statistics summer positions There is no deadline, but we’ll start reviewing applications […]

## Jordana Cepelewicz on “The Hard Lessons of Modeling the Coronavirus Pandemic”

Here’s a long and thoughtful article on issues that have come up with Covid modeling. Jordana Cepelewicz. 2021. The Hard Lessons of Modeling the Coronavirus Pandemic. Quanta. Jordana’s a staff writer for Quanta, a popular science magazine funded by the Simons Foundation, which also funds the Flatiron Institute, where I now work. She’s a science […]

## How many infectious people are likely to show up at an event?

Stephen Kissler and Yonatan Grad launched a Shiny app, Effective SARS-CoV-2 test sensitivity, to help you answer the question, How many infectious people are likely to show up to an event, given a screening test administered n days prior to the event? Here’s a screenshot. The app is based on some modeling they did with […]

## How to describe Pfizer’s beta(0.7, 1) prior on vaccine effect?

Now it’s time for some statistical semantics. Specifically, how do we describe the prior that Pfizer is using for their COVID-19 study? Here’s a link to the report. A PHASE 1/2/3, PLACEBO-CONTROLLED, RANDOMIZED, OBSERVER-BLIND, DOSE-FINDING STUDY TO EVALUATE THE SAFETY, TOLERABILITY, IMMUNOGENICITY, AND EFFICACY OF SARS-COV-2 RNA VACCINE CANDIDATES AGAINST COVID-19 IN HEALTHY INDIVIDUALS Way […]

## UX issues around voting

While Andrew’s worrying about how to measure calibration and sharpness on small N probabilistic predictions, let’s consider some computer and cognitive science issues around voting. How well do elections measure individual voter intent? What is the probability that a voter who tries to vote has their intended votes across the ballot registered? Spoiler alert. It’s […]

## *The Economist* not hedging the predictions

Andrew’s hedge that he’s predicting vote intent and not accounting for any voting irregularities either never made it to the editorial staff at The Economist or they chose to ignore it. Their headline still reports that they’re predicting the electoral college, not voter intent: Predicting voter intent is a largely academic exercise in that all […]

## Hiring at all levels at Flatiron Institute’s Center for Computational Mathematics

We’re hiring at all levels at my new academic home, the Center for Computational Mathematics (CCM) at the Flatiron Institute in New York City. We’re going to start reviewing applications January 1, 2021. A lot of hiring We’re hoping to hire many people for each of the job ads. The plan is to grow CCM […]

## More on absolute error vs. relative error in Monte Carlo methods

This came up again in a discussion from someone asking if we can use Stan to evaluate arbitrary integrals. The integral I was given was the following: where the -ball is assumed to be in dimensions so that . (MC)MC approach The textbook Monte Carlo approach (Markov chain or plain old) to evaluating such an […]

## Probabilities for action and resistance in Blades in the Dark

Later this week, I’m going to be GM-ing my first session of Blades in the Dark, a role-playing game designed by John Harper. We’ve already assembled a crew of scoundrels in Session 0 and set the first score. Unlike most of the other games I’ve run, I’ve never played Blades in the Dark, I’ve only […]

## Drunk-under-the-lamppost testing

Edit: Glancing over this again, it struck me that the title may be interpreted as being mean. Sorry about that. It wasn’t my intent. I was trying to be constructive and I really like that analogy. The original post is mostly reasonable other than on this one point that I thought was important to call […]

## Super-duper online matrix derivative calculator vs. the matrix normal (for Stan)

I’m implementing the matrix normal distribution for Stan, which provides a multivariate density for a matrix with covariance factored into row and column covariances. The motivation A new colleague of mine at Flatiron’s Center for Comp Bio, Jamie Morton, is using the matrix normal to model the ocean biome. A few years ago, folks in […]

## Make Andrew happy with one simple ggplot trick

By default, ggplot expands the space above and below the x-axis (and to the left and right of the y-axis). Andrew has made it pretty clear that he thinks the x-axis should be drawn at y = 0. To remove the extra space around the axes when you have continuous (not discrete or log […]

## Prior predictive, posterior predictive, and cross-validation as graphical models

I just wrote up a bunch of chapters for the Stan user’s guide on prior predictive checks, posterior predictive checks, cross-validation, decision analysis, poststratification (with the obligatory multilevel regression up front), and even bootstrap (which has a surprisingly elegant formulation in Stan now that we have RNGs in transformed data). Andrew then urged me to […]

## Naming conventions for variables, functions, etc.

The golden rule of code layout is that code should be written to be readable. And that means readable by others, including you in the future. Three principles of naming follow: 1. Names should mean something. 2. Names should be as short as possible. 3. Use your judgement to balance (1) and (2). The third […]

## Is data science a discipline?

Jeannette Wing, director of the Columbia Data Science Institute, sent along this link to this featured story (their phrase) on their web site. Is data science a discipline? Data science is a field of study: one can get a degree in data science, get a job as a data scientist, and get funded to do […]

## What can we do with complex numbers in Stan?

I’m wrapping up support for complex number types in the Stan math library. Now I’m wondering what we can do with complex numbers in statistical models. Functions operating in the complex domain The initial plan is to add some matrix functions that use complex numbers internally: fast fourier transforms asymmetric eigendecomposition Schur decomposition The eigendecomposition […]

## A normalizing flow by any other name

Another week, another nice survey paper from Google. This time: Papamakarios, G., Nalisnick, E., Rezende, D.J., Mohamed, S. and Lakshminarayanan, B., 2019. Normalizing Flows for Probabilistic Modeling and Inference. arXiv 1912.02762. What’s a normalizing flow? A normalizing flow is a change of variables. Just like you learned way back in calculus and linear algebra. Normalizing […]

## Monte Carlo and winning the lottery

Suppose I want to estimate my chances of winning the lottery by buying a ticket every day. That is, I want to do a pure Monte Carlo estimate of my probability of winning. How long will it take before I have an estimate that’s within 10% of the true value? It’ll take… There’s a big […]
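
As a rough back-of-the-envelope sketch of the question above (not necessarily the calculation in the full post), one way to read “within 10% of the true value” is as a relative standard error of 10%. A Monte Carlo estimate of a probability p from N independent draws has standard error sqrt(p(1 - p) / N), so setting that equal to 0.1 · p and solving gives N = (1 - p) / (0.01 · p). The win probability `p_win` below is a made-up illustrative value, not a figure from the post, and the function name is hypothetical.

```python
import math


def draws_for_relative_error(p, rel_err):
    """Draws N at which the Monte Carlo standard error equals rel_err * p.

    Assumes independent Bernoulli(p) draws, so the estimate's standard
    error is sqrt(p * (1 - p) / N). Setting that to rel_err * p and
    solving for N gives (1 - p) / (p * rel_err**2).
    """
    return math.ceil((1.0 - p) / (p * rel_err ** 2))


# Hypothetical chance of winning on any one ticket (an assumption for illustration).
p_win = 1e-8

n_days = draws_for_relative_error(p_win, 0.10)
n_years = n_days / 365.25

print(f"tickets (days) needed: {n_days:.3g}")
print(f"years needed:          {n_years:.3g}")
```

The required number of draws scales like 1/p, so the rarer the event, the longer plain Monte Carlo takes to reach a fixed relative error, even though the absolute error is tiny almost immediately.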