
“Partially Identified Stan Model of COVID-19 Spread”

Robert Kubinec writes:

I am working with a team collecting government responses to the coronavirus epidemic. As part of that, I’ve designed a Stan time-varying latent variable model of COVID-19 spread that uses only observed tests and cases. I show that while it is impossible to know the true number of infected cases, we can rank/sign identify the effects of government policies on the virus spread. I do some preliminary analysis with the dates of emergency declarations of US states to show that states that declared earlier seem to have lower total infection rates (though they have not yet flattened the infection curve).

Furthermore, by incorporating informative priors from SEIR/SIR models, it is possible to identify the scale of the latent variable and provide more informative estimates of total infected. These estimates (conditional on a lower bound based on SIR/SEIR models) report that approximately 700,000 Americans have been infected as of yesterday, or roughly 6-7 times the observed case count, as many SEIR/SIR models have predicted.

I’m emailing you as I would love feedback on the model as well as to share it with others who may be engaged in similar modeling tasks.
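
To make the idea of rank/sign identification concrete, here is a toy sketch in Python. It is not Kubinec's Stan model: the state labels, the latent rates, and the helper names `ranking` and `rescale` are all made up for illustration. It only shows the general point that if a latent quantity is known up to an unknown positive scale, orderings and signs of differences survive even though the absolute level does not. The last lines do the back-of-envelope arithmetic implied by the figures quoted in the email (about 700,000 infected at 6-7 times the observed count); the resulting observed-case range is derived here, not reported in the email.

```python
# Toy illustration only: NOT the model from the paper.
# Hypothetical latent infection rates for three made-up states.
latent = {"A": 0.010, "B": 0.004, "C": 0.007}


def ranking(values):
    """Return the keys ordered from lowest to highest value."""
    return sorted(values, key=values.get)


def rescale(values, c):
    """Multiply every value by a positive scale factor c (unknown in practice)."""
    return {k: v * c for k, v in values.items()}


# Whatever positive scale we pick, the ordering of states and the sign of
# any pairwise difference are unchanged; only the absolute level moves.
for c in (0.1, 1.0, 25.0):
    scaled = rescale(latent, c)
    assert ranking(scaled) == ranking(latent)
    assert (scaled["A"] - scaled["B"] > 0) == (latent["A"] - latent["B"] > 0)

# Back-of-envelope arithmetic from the email's figures:
# ~700,000 estimated infected, at 6 to 7 times the observed case count.
estimated_infected = 700_000
implied_observed = [round(estimated_infected / m) for m in (6, 7)]
print(ranking(latent))    # ['B', 'C', 'A']
print(implied_observed)   # [116667, 100000]
```

Pinning down the scale, as the email describes doing with SEIR/SIR-based priors, is what would turn such relative comparisons into estimates of absolute totals.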

Paper link

GitHub with Data & Stan code


  1. Ben Vigoda says:

    We are looking for a few more collaborators to help build a machine learning data science online app to help prevent COVID-19 and manage social distancing at a local level and on a weekly basis.

    Problem we want to solve

    To turn parts of society back on, state and local governments need to know when it is safe for people to decrease their social distancing, zipcode by zipcode, week by week. This will be needed until we have antivirals, vaccines, and/or herd immunity – probably at least for much of 2020.

    Proposed solution

    Provide an online “safe map” – an online covid advisory to enable local leaders and communities to easily understand the progress of COVID-19 in their area on a daily or weekly basis. Use the safe map to keep R<1 everywhere, always.

    We will utilize a range of data streams including Facebook's new Disease Prevention Maps. At first we will not be trying to make complex long-range predictions, but as we iterate we may write and train/estimate more complex models.


    Current collaborators include the London School of Hygiene and Tropical Medicine, Columbia CS and Statistics Departments, Facebook Disease Prevention Maps, and professional data science engineers from multiple networks including the Open Data Science Conference (ODSC).

    Here is a link to a more detailed project description,

    and a google sheet where people can join the project.

    All the best,
    Ben Vigoda

    MIT PhD, Post-Doctoral Fellow, Visiting Scientist, National Academy of Sciences Kavli Fellow, World Economic Forum Technology Pioneer.
    CEO Gamalon, Inc. (An advanced machine learning company)

  2. Rich Haney says:

    I would be interested in contributing to the development and testing of some of the underlying statistical models. I’m a retired statistician/applied mathematician with a background in mortality modeling, pharmacometrics, financial risk models and other areas.

    I could chip in on the development and testing of models using R, Stan, MATLAB, or SAS; I have had a niche in models that use differential equations. I am interested, for example, in models that use different kinds of nonlinear terms (e.g., terms for densities).

    In my experience, success in this area requires running thousands of models, and I can also work on practical aspects of that problem.

    Rich Haney
