More coronavirus research: Using Stan to fit differential equation models in epidemiology

Posted on April 2, 2020 11:23 AM by Andrew

Seth Flaxman and others at Imperial College London are using Stan to model coronavirus progression; see here (and I’ve heard they plan to fix the horrible graphs!) and this Github page.

They also pointed us to this article from December 2019, Contemporary statistical inference for infectious disease models using Stan, by Anastasia Chatzilena et al. I guess this particular paper will be useful for people getting started in this area, or for epidemiologists who’ve been hearing about Stan and would like to know how to use it for differential equation models in epidemiology. I have not read the article in detail.

We’re also doing some research on how to do inference for differential equations more efficiently in Stan. Nothing ready to report here, but new things will come soon, I hope. One idea is to run the differential equation solver on a coarser time scale in the NUTS updating and, use importance sampling to correct the errors, and then run the solver on the finer time scale in the generated quantities block.

6 thoughts on “More coronavirus research: Using Stan to fit differential equation models in epidemiology”

Bob Carpenter on April 2, 2020 11:47 AM at 11:47 am said:

The most promising avenue of research for speeding up ODE solving in Stan of which I’m aware is adding adjoint methods, which are the equivalent of reverse-mode autodiff for ODE sensitivities. But if a model needs to solve lots of ODEs and you have access to a cluster, then either MPI or multi-threading map functions work well. Those should be getting much cleaner syntactically in the next release or the one after (mid-April or mid-July according to the current quarterly release schedule).

The differential equation solvers in Stan are controlled by setting the points at which you want solutions and error tolerances. You can indirectly make them take larger steps by reducing error tolerance. We’ve found in the past that tolerances below 1e-4 or 1e-5 if I’m remembering correctly caused problems in the PK/PD models we were fitting. What would we then do in the generated quantities block to make up for the fact that our parameters are off?

Reply ↓
More Anonymous on April 2, 2020 3:21 PM at 3:21 pm said:

Fantastic paper from the Imperial College of London team and perfect timing for its release. Here’s some of their abstract, which explains what they did:

…many European countries have implemented unprecedented non-pharmaceutical interventions including case isolation, the closure of schools and universities, banning of mass gatherings and/or public events, and most recently, widescale social distancing including local and national lockdowns. In this report, we use a semi-mechanistic Bayesian hierarchical model to attempt to infer the impact of these interventions across 11 European countries.

They find that the non-pharmaceutical interventions have greatly reduced the effective reproduction number, Rt, but it is not yet clear if the interventions have done enough to reduce Rt below 1. (If Rt falls below 1, the epidemic will begin to die off.) They also find that the non-pharmaceutical interventions have prevented a huge number of deaths from COVID-19 to date.

It would be valuable to have a similar analysis for US states. I’m not sure whether enough deaths have accrued in enough US states for the analysis to be informative yet, but when they do it should be straightforward to apply this to the US. It would be a great way to communicate
the effectiveness of non-pharmaceutical interventions in Washington, New York, etc. to government officials in states that have not experienced many COVID-19 deaths yet.

Reply ↓
Peter Chapman on April 2, 2020 4:13 PM at 4:13 pm said:

My initial thoughts on the paper. It’s not my area of expertise so happy to be corrected on all of these points. I have read the paper and get the concept of the model but have not spent any time digging down into the details. If I had attended a presentation by the authors these are some of the issues that I would like to tease out during the discussion:

• Seems an overly complex, heavily parameterised, model for what appears to me to be a straightforward set of data.
• Number of deaths is modelled as a negative binomial. Not sure what the justification is for this.
• Analysis results in an estimate of total number infected in each country at each date. The data alone does not contain necessary information to allow this, so some big assumptions are being made. Not sure what they are. (The numbers known to be infected from testing will be much lower than the real number infected, and in any case vary greatly amongst countries because of different testing regimes – e.g. Germany clearly carrying out far more tests than UK, giving superficial, but probably false, impression that death rate in Germany is much lower than in UK).
• Paper gives estimates of lives saved by lock down policies. Not sure how this works, given that time to death from initial infection seems to be about 20 days, but no country had been in lock down that long by 28 March (i.e. deaths on 28 March, and earlier, would have resulted from infections prior to lock down, so lock down would not have saved any deaths by 28 March)

Reply ↓
- Brent Hutto on April 2, 2020 4:57 PM at 4:57 pm said:
  
  Not to mention that some portion of the most vulnerable people whose lives were predicted as “saved” will still end up infected later and possibly die. Flattening the curve through lockdown does not actually cure the spread of the virus, just forestalls a large portion of infections into the future. Hopefully far into the future but that’s an unknown.
  
  Some portion of those lives “saved” will end up being people who die six months or a year from now rather than next month. But impossible to model that kind of follow-on scenario at this point.
  
  Reply ↓
  - Eric on April 3, 2020 7:44 PM at 7:44 pm said:
    
    Yes, but it also definitely saves patients who would otherwise get less care because of finite resources. As it is, all doctors right now are triaging patient care and delaying non-urgent care to minimize exposure and resource usage. I’m not operating on patients who might want elective surgery right now so that I save the masks and resources for an urgent/emergent surgery for another patient of mine in 2 months time.
    
    A surge of COVID-19 cases in the hospital now will definitely cause many of my patients to go permanently (bilaterally) blind because a stressed system will not have the capacity for the 1-2 urgent cases I do each week. (I’m an ophthalmologist.)
    
    Even if the infections aren’t pushed off *far* into the future, if they’re long enough to prevent too much stress on the system, it can save unrelated people’s lives and livelihoods.
    
    Reply ↓
Dan F. on April 3, 2020 9:12 AM at 9:12 am said:

I like the 95% credible interval of 3.7-41 for the percent of Spain that has been infected (on p. 6).

Reply ↓

Statistical Modeling, Causal Inference, and Social Science

More coronavirus research: Using Stan to fit differential equation models in epidemiology

6 thoughts on “More coronavirus research: Using Stan to fit differential equation models in epidemiology”

Leave a Reply Cancel reply