Here’s a long and thoughtful article on issues that have come up with Covid modeling.
- Jordana Cepelewicz. 2021. The Hard Lessons of Modeling the Coronavirus Pandemic. Quanta.
Jordana’s a staff writer for Quanta, a popular science magazine funded by the Simons Foundation, which also funds the Flatiron Institute, where I now work. She’s a science reporter, not a statistician or machine learning specialist. A lot of Simons Foundation funding goes to community outreach and math and science education. Quanta aims to be more like the old Scientific American than the new Scientific American; but it also has the more personal angle of a New Yorker article on science (my favorite source of articles on science because the writing is so darn good).
There’s also a film that goes along with Jordana’s article:
- Quanta. 2021. Why COVID-19 Models Don’t Predict the Future. YouTube.
I found the comments on YouTube fascinating. Not as off the wall as replies to newspaper articles, but not the informed stats readership of this (Andrew’s) blog, either.
“we respond to the model’s predictions and prove them wrong”
To the extent that our behavior changes the outcome of a sequence of events, humans are mostly reacting to information obtained through the many channels we have to understand our world, and almost certainly *not* reacting to scientific model projections. :)
An interesting mishmash of misdirection and excuses.
The onset of a pandemic is largely characterised by a single parameter, the doubling rate. This is clearly identifiable in numerous data sets. The modellers simply didn’t do the elementary task of estimating this parameter and as a result (in the UK at least) grossly underestimated the problem through Feb and March. This doesn’t require advanced modelling skills, it takes nothing more than a spreadsheet or even a bit of easy mental arithmetic. Just fit a line on a log plot. That’s literally all it takes, and adequate data were readily available.
This isn’t the only mistake they made, but it’s probably the most glaring one.
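The "fit a line on a log plot" calculation the commenter describes can be sketched in a few lines of Python. This is a hypothetical illustration with synthetic data, not anyone's actual analysis: a least-squares line through log counts, whose slope gives the doubling time.

```python
import math

def doubling_time(counts):
    """Estimate the doubling time (in reporting intervals) from a
    series of counts by least-squares fitting a straight line to
    log(counts) -- i.e. assuming exponential growth."""
    xs = list(range(len(counts)))
    ys = [math.log(c) for c in counts]
    n = len(counts)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
            / sum((x - mean_x) ** 2 for x in xs)
    return math.log(2) / slope  # intervals needed for counts to double

# Synthetic daily counts doubling every 3 days:
counts = [100 * 2 ** (t / 3) for t in range(10)]
print(round(doubling_time(counts), 2))  # → 3.0
```

With real, noisy surveillance data one would fit over a window of recent days rather than the whole series, but the point stands: this requires nothing beyond a spreadsheet's trendline function.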
Responding to this comment, not to the original post.
I can assure you that I and all of the epidemic modelers I have ever interacted with do indeed know how to estimate a doubling rate. Can you please provide some evidence for your claim? What kind of mistakes do you think modelers made in estimating the doubling rate? And what is your suggested method for dealing with temporal changes in doubling rate due to changes in testing rate, changes in testing bias, changes in fraction susceptible (including effects of population heterogeneity) and changes in behaviour over time due to non-pharmaceutical interventions and individual choices?
Documented in detail in the minutes of the UK’s SAGE meetings, as outlined on my blog.
https://julesandjames.blogspot.com/2020/04/blueskiesresearchorguk-5-day-doubling.html
https://julesandjames.blogspot.com/2020/07/patrick-vallances-faulty-memory.html
The mistake they made was to assume a doubling rate of about 5 days (which was never more than a very rough estimate from early Chinese data) and reject the mountain of evidence of a much more rapid doubling that was apparent in multiple European countries including the UK by early/mid March. This was a direct cause of the delayed lockdown in the UK that cost tens of thousands of lives in the first wave.
At a quick glance, your criticisms look reasonable to me. I can also see now that I misread your comment as “did it wrong/didn’t know how to do it right” vs “simply didn’t do” (which is what you actually said): sorry for jumping to incorrect conclusions.
A couple of other points:
* while r (growth rate) and generation interval determine R (growth per generation), there are some subtleties in the translation (depends on the shape of the GI, not just its mean: https://royalsocietypublishing.org/doi/10.1098/rsif.2020.0144 ). It could remove some confusion if people focused more on r, which is more directly connected to the observed growth of the epidemic …
* it’s kind of ironic that Ferguson gets hammered both (reasonably) by you for being overly optimistic by failing to calibrate appropriately and (ridiculously) for being too pessimistic (I see that yours is the first comment on the epic comment thread at https://statmodeling.stat.columbia.edu/2020/05/08/so-the-real-scandal-is-why-did-anyone-ever-listen-to-this-guy/)
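The r-to-R subtlety in the first bullet can be illustrated with the standard Euler-Lotka relation, R = 1/M(-r), where M is the moment-generating function of the generation-interval distribution. The sketch below assumes a gamma-distributed GI (which gives a closed form); the numbers are made up for illustration.

```python
import math

def R_from_r(r, gi_mean, gi_sd):
    """Convert the exponential growth rate r (per day) into the
    reproduction number R, assuming a gamma-distributed generation
    interval.  For a gamma GI with shape k and scale theta, the
    Euler-Lotka relation R = 1/M(-r) reduces to (1 + r*theta)**k."""
    shape = (gi_mean / gi_sd) ** 2   # gamma shape k
    scale = gi_sd ** 2 / gi_mean     # gamma scale theta
    return (1 + r * scale) ** shape

# Same r, same mean GI, different GI spread -> different R:
r = math.log(2) / 3                      # growth rate for a 3-day doubling time
print(round(R_from_r(r, 5.0, 3.0), 2))   # dispersed GI: R around 2.6
print(round(R_from_r(r, 5.0, 0.5), 2))   # nearly fixed GI: R around 3.2
```

Note that r and the doubling time are read straight off the data, while R inherits whatever uncertainty there is in the GI distribution — which is the commenter's argument for focusing on r.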
“did it wrong/didn’t know how to do it right” vs “simply didn’t do”
I don’t see the difference. If you didn’t do something that should have been done, you did what you did do wrong. Right? :)
I like this bit from Bloomberg via Marginal Revolution:
https://marginalrevolution.com/marginalrevolution/2021/02/profile-of-youyang-gu.html
It’s great that the Big Med Research eventually converged on the answer. But it’s disappointing that they needed a data scientist with absolutely zero experience in epidemiology to point the way. It seems more and more that, above a certain level of general understanding, subject expertise has relatively little benefit.
Isn’t this too simplistic?
The doubling rate isn’t some static parameter. Identifying the rate right now is hardly the problem; the issue is predicting how the doubling rate will change in the future.
If the doubling rate were some static parameter like the half life of radioactive decay then life would be easy. But it isn’t!
It certainly gets more complicated once we start to change our behaviour. At the outset, however, before people were doing anything on a large scale, the doubling rate was very stable, and this is the primary determinant of the timing of the outbreak (and also strongly related to the height of the peak in the absence of mitigation). The lack of urgency in the UK response up to late March was directly based on the advice that the peak was still a couple of months off … even as Italian hospitals were overflowing.
Of course there’s always some uncertainty, but the log plots (of both cases, and deaths) were all impressively close to straight lines with similar slopes, substantially steeper than the 5-6 day range that the UK modellers were fixated on.
https://www.medrxiv.org/content/10.1101/2020.04.04.20050427v2
https://www.medrxiv.org/content/10.1101/2020.04.14.20065227v2.full.pdf
Is the doubling rate a fundamental parameter or is it the expression of some other fundamental parameter?
I saw an interview with Michael Osterholm the other day. While he missed the boat on masks, and although he’s not a modeler, his qualitative predictions have been pretty accurate. His view is that the progression of the pandemic from last winter until now is more or less what would be expected based on the seasonal effects of coronaviruses in general and other factors such as the emergence of various mutations.
A lot has been lost by focusing on the quantitative instead of the qualitative.
Can you clarify what you mean by a “fundamental parameter”? This is epidemiology (= virology + immunology + ecology + evolution + sociology + political science), so it’s hard to know what would be “fundamental”. The nice thing about the doubling rate is that it’s an easily quantifiable number that can give you reasonable short-term predictions.
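As a toy illustration of that last point, a constant doubling rate gives a one-line short-term projection (hypothetical numbers, and obviously only valid as long as the rate actually stays constant):

```python
def project(current, doubling_time_days, days_ahead):
    """Naive short-term projection assuming the current doubling
    time holds: counts grow by a factor of 2**(t / doubling_time)."""
    return current * 2 ** (days_ahead / doubling_time_days)

# 1000 cases today with a 3-day doubling time -> roughly 25,000 in two weeks:
print(round(project(1000, 3, 14)))
```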
A “fundamental parameter” is a variable that measures a distinct natural process – as opposed to a number that’s an index or rough representation of a group of unknown processes.
Has anyone looked at Youyang Gu’s predictions? I stopped tracking covid predictions a long time ago, but this guy seems to be getting some positive attention lately.
Hmmm, apparently he’s started things up again; he had paused earlier. At the point when he said he was going to stop updating, his model was doing quite well.