New report on coronavirus trends: “the epidemic is not under control in much of the US . . . factors modulating transmission such as rapid testing, contact tracing and behavioural precautions are crucial to offset the rise of transmission associated with loosening of social distancing . . .”

Juliette Unwin et al. write:

We model the epidemics in the US at the state-level, using publicly available death data within a Bayesian hierarchical semi-mechanistic framework. For each state, we estimate the time-varying reproduction number (the average number of secondary infections caused by an infected person), the number of individuals that have been infected and the number of individuals that are currently infectious. We use changes in mobility as a proxy for the impact that NPIs and other behaviour changes have on the rate of transmission of SARS-CoV-2. We project the impact of future increases in mobility, assuming that the relationship between mobility and disease transmission remains constant. We do not address the potential effect of additional behavioural changes or interventions, such as increased mask-wearing or testing and tracing strategies.

Nationally, our estimates show that the percentage of individuals that have been infected is 4.1% [3.7%-4.5%], with wide variation between states. For all states, even for the worst affected states, we estimate that less than a quarter of the population has been infected; in New York, for example, we estimate that 16.6% [12.8%-21.6%] of individuals have been infected to date. Our attack rates for New York are in line with those from recent serological studies [1] broadly supporting our modelling choices.

There is variation in the initial reproduction number, which is likely due to a range of factors; we find a strong association between the initial reproduction number with both population density (measured at the state level) and the chronological date when 10 cumulative deaths occurred (a crude estimate of the date of locally sustained transmission).

Our estimates suggest that the epidemic is not under control in much of the US: as of 17 May 2020, the reproduction number is above the critical threshold (1.0) in 24 [95% CI: 20-30] states. Higher reproduction numbers are geographically clustered in the South and Midwest, where epidemics are still developing, while we estimate lower reproduction numbers in states that have already suffered high COVID-19 mortality (such as the Northeast). These estimates suggest that caution must be taken in loosening current restrictions if effective additional measures are not put in place.

We predict that increased mobility following relaxation of social distancing will lead to resurgence of transmission, keeping all else constant. We predict that deaths over the next two-month period could exceed current cumulative deaths by greater than two-fold, if the relationship between mobility and transmission remains unchanged. Our results suggest that factors modulating transmission such as rapid testing, contact tracing and behavioural precautions are crucial to offset the rise of transmission associated with loosening of social distancing.

Overall, we show that while all US states have substantially reduced their reproduction numbers, we find no evidence that any state is approaching herd immunity or that its epidemic is close to over.

One question I have is about the assumptions underlying “increased mobility following relaxation of social distancing.” Even if formal social distancing rules are relaxed, if the death rate continues, won’t enough people be scared enough that they’ll limit their exposure, thus reducing the rate of transmission? This is not to suggest that the epidemic will go away, just that maybe people’s behavior will keep the infections spreading at something like the current rate? Or maybe I’m missing something here.

The report and other information are at their website.

Unwin writes:

Below is our usual three panel plot showing our results for the five states we use as a case study in the report – we chose them because we felt they showed different responses across the US. New in this report, we estimate the number of people who are currently infectious over time – the difference between this and those getting newly infected each day is quite stark.

We have also put the report on open review, which is an online platform enabling open reviews of scientific papers. It’s usually used for computer science conferences but Seth Flaxman has been in touch to partner with them to try it for a pre-print. If you’d like, click on the link and you can leave us a comment or recommend a reviewer.

Lots and lots of graphs (they even followed some of my suggestions, but I’m still concerned about the way that the upper ends of the uncertainty bounds are so visually prominent), and they fit a multilevel model in Stan, which I really think is the right way to go, as it allows a flexible workflow for model building, checking, and improvement.

You can make of the conclusions what you will: the model is transparent, so you should be able to map back from inferences to assumptions.

130 thoughts on “New report on coronavirus trends: “the epidemic is not under control in much of the US . . . factors modulating transmission such as rapid testing, contact tracing and behavioural precautions are crucial to offset the rise of transmission associated with loosening of social distancing . . .”

  1. These graphs are very nice to look at, and even with my poor color vision I can distinguish the different areas without a problem. Thanks!

    The third column shows that the R0 value has leveled off at nearly one for many states. You can actually tell that from the data without needing to do any calculation, because the number of new cases per day is nearly constant for those states. This leads me to ask: how could that possibly happen? Why is it that an R0 of that special value is so widespread in the US? Why not e.g. 0.5 or 1.6?

    There would seem to be two possibilities:

    1. There is some kind of feedback effect going on;
    2. The degree of control that has been achieved in the US – movement, masks, distancing, store closures, etc. just happens to lead to a value near 1 across a wide range of culturally different states.

    I’m having trouble seeing either of these as all that plausible.

    • Figure 4 in the paper (21 May 2020, pg 8) shows quite a homogenous spread of R between ~0.7 and ~1.3, so I don’t think the value 1.0 is special in any way.

      A possible obvious feedback effect is “state tightens social distancing measures until cases stop growing alarmingly”.

      P.S.: Your “degree of control” list omits testing and contact tracing.

    • Tom said,
      “These graphs are very nice to look at, and even with my poor color vision I can distinguish the different areas without a problem. Thanks!”

      Ah, vision can vary so much from individual to individual! The graphs following “Below is our usual three panel plot” are fine for my vision. But the ones at the top, with the two shades of violet, send my eyes and brain reeling. Kind of psychedelic for me.

  2. I sometimes wonder whether epidemiologists use the word “predict” in a way that differs from its standard meaning. Most of the models I’ve seen assume a sort of frictionless environment where psychology doesn’t play much of a role. I don’t know whether it’s true, but I always thought epidemiology was more concerned with describing the past than predicting the future. Their models (or, again, the few I’ve seen) certainly look consistent with that idea.

    (Note that I’m not blaming epidemiologists for this: I think we sometimes need idealizing assumptions. But words like “predict” could certainly lead people to believe they’re claiming something when they’re not)

    • They’re predicting “if xyz then pdq” (or in the example case, “pdq if xyz”) which is a responsible form of prediction. For example:

      “We predict that deaths over the next two-month period could exceed current cumulative deaths by greater than two-fold, if the relationship between mobility and transmission remains unchanged.”

      In public discourse the condition will immediately be dropped, and then politically motivated attack drones on twitter will claim that they predicted something that didn’t come true and therefore are incompetent.

      • I understand your point, but I guess for me the condition is too idealizing to use the word “predict” in the present context. I think if you asked, “Do you *really* think the outcome you predict could actually occur?”, a lot of the time the answer would be “no”: people will adjust their behavior and/or governments would act long before that (and here I’m thinking about the x million deaths “prediction”).

        Again, I’m not really disputing your point, but I think the term is just unhelpful outside of people with a certain degree of knowledge about what is being claimed.

        • Groups such as this blog should understand that when they predict what will happen, it’s conditional on some particular assumptions carrying forward in time.

          Without a world where we agree to do that, you can’t have academic discussion. You need to be able to discuss the methodology of the prediction independent of the assumptions. Because if your predictions aren’t even reasonable if the assumptions hold, then your model is wrong, and it doesn’t matter how much you goof around with the assumptions.

          Once you have a model with reasonable predictive ability, you can move on to trying to adjust the assumptions to be reasonable under plausible future scenarios.

          But when things get politicized, all that goes out the window. No one cares about science anymore, and it’s all about denigrating the “other tribe”. One major mechanism for doing that is to assert that people made predictions that they didn’t make (i.e., unconditional predictions) and that the predictions didn’t hold, and therefore the models are all wrong. The models can be totally fine while the assumptions about behavior etc. fail to hold, and then the predictions will diverge and *should* diverge. This isn’t evidence of incompetence; it’s evidence that the model is sensitive to the things that were different between the assumptions and the reality.

          It’s far too subtle a point for the twitterverse etc though.

        • I’m sure that people here know what is meant. It’s a question of what happens when the discussion moves to the public realm. I had lots of people telling me in panic at the time that scientists were predicting 6 million people were going to die. I told them that if they actually read what the report said, they would realize it wasn’t a prediction in the sense they thought it was (the report even said that the condition was implausible). But people don’t read and the press doesn’t report the d

          I don’t know what the best way is to communicate scientific information to the general public in a pandemic. I just wish there were another way than “we predict . . .” when the assumptions are psychologically implausible.

        • Where have you seen the press not reporting the underlying assumptions? It seems like that is where you should focus your concern.

        • Here, for instance?

          https://www.google.fi/amp/s/amp.theguardian.com/commentisfree/2020/mar/22/the-guardian-view-on-the-coronavirus-crisis-much-worse-is-to-come

          “ Given expert predictions that the total number of UK deaths could be 250,000, the priority is to save lives.”

          Let me be clear, my argument is merely that words like “prediction” can be counterproductive in the long run if they are heard by people who don’t really understand models. And I understand the need for idealizing assumptions. As I said, I don’t have a solution. I guess adding plausible changes in behavior/mobility after X number of deaths is too difficult.

        • I don’t see much concerning about that article. The ICL study projected 500k deaths with no mitigation, 250k deaths with the UK’s mitigation policy in place at the point the study was published. Not clear when the article was published (I don’t see a byline), but it appears it was during the period after the study was published and while the govt was deciding on a new approach.

          In any case, if you think the article is misleading, I think your issue is with the journalist, not the epidemiologists.

        • Joseph said,
          “Where have you seen the press not reporting the underlying assumptions? It seems like that is where you should focus your concern.”

          I think that there is also a responsibility of scientists to report their findings in ways that do more than reporting underlying assumptions. One possibility that comes to mind is to say something like, “predictions in different scenarios”, and repeatedly point out the different scenarios in their conclusions — e.g., things like, “If we do x, y and z, then our models predict that …; but if we only do x, then the models predict that …”. (I agree it is hard to figure out how best to do this, but I think we can do better than we currently do.)

        • Joseph said, “In any case, if you think the article is misleading, I think your issue is with the journalist, not the epidemiologists.”

          As my comment above suggests (I hope), I think that the epidemiologists (and other scientists) bear the major part of the responsibility for communicating clearly with the journalists — and by “communicating clearly”, I mean (among other things) being mindful that the journalists (and the general public) do not “speak the same language” as we do. In other words, scientists need to realize that they have an “ongoing” responsibility to educate the journalists and the general public (e.g., in all their interactions with journalists). One phrase I try to keep in mind when teaching is that “telling is not teaching”. We need to remember that admonition and remember to take it into account when communicating with journalists and the general public.

        • @Martha

          In my reading of the 3/16 Imperial College London Study (the source of the 250k estimate cited by Joe above) and all the interviews I’ve watched of Ferguson (lead author of the study), I’ve found it very clear what key assumptions were in place for what predictions.

          Have you found it to be otherwise?

        • Joe,

          I agree with your concern. They’re attaching an unlikely condition to the prediction that allows them to make a more ominous-sounding prediction. It’s fine that people in some circles will be aware of the conditions, but because the conditions are unlikely, it’s still kind of a false prediction, made more for the purposes of attention seeking and headline grabbing than for science.

          Why don’t we all just gin up some unlikely conditions to make “predictions” that are unlikely to come true?

  3. Andrew –

    > if formal social distancing rules are relaxed, if the death rate continues, won’t enough people be scared enough that they’ll limit their exposure, thus reducing the rate of transmission?

    “People” is a pretty broad term there. Some probably will. Probably some won’t and reach a kind of status quo. And still others will see it as a badge of honor to not wear masks, not limit exposure, etc. and in fact increase their levels of infectious behaviors largely in inverse proportion to the extent to which the media report on increasing infections.
    And it is likely to be a regional distinction, to some degree.

    I’d say it’s pretty hard, at this point, to conjecture that there will be a significant change in any particular direction in direct response to levels of infectiousness. In this country, at least. At least as long as the current president remains in place.

    • Indeed, this is why all predictions must be conditional… “if abc happens then also def will happen”… Because there’s no way to predict which abc will happen since all the abcs have to do with how people choose to behave, and that’s informed by all kinds of weirdness like the fact that huge numbers of disinformation troll-bots are tweeting “reopen”

      https://news.slashdot.org/story/20/05/21/2147232/nearly-half-of-twitter-accounts-pushing-to-reopen-america-may-be-bots

      • That article deserves a wow.

        And we can’t even predict what’s going to happen if there is an effective vaccine that can be manufactured and distributed on a massive scale.

        I like to think that the anti-vaxx stuff will drop off to a large degree, but as it exists now there is evidence that a large % of the public will actively resist taking, and even the promotion of, a vaccine.

        I’m always reluctant to have confidence in identifying society-wide paradigm shifts – but today’s media environment really does feel different. The whole “Plandemic” movement, and the significant overlap with elements of the dominant political party, suggests a lot of things that can go in ways that would previously have seemed to me unimaginable.

        • It deserves a wow, and yet it doesn’t. We need to get to the place where people acknowledge as routine that Twitter is full of bots and most of what is going on over there is driven by info-warfare.

        • Also, lots of the people on twitter aren’t bots but act like bots, as with a couple of the commenters who we had to deal with on this blog during the past week or two.

          We used to talk about trolls, but these people or bots aren’t trolls, exactly. Trolls want to attract attention and get you to respond; these people are trying to steer the discussion by flooding the zone with bad-faith arguments. And, yes, I know this is not new, and I know that it follows the pattern of talk radio etc. I guess I’m more sensitized to it now, having seen it happen in our comments section.

        • This is the problem with training people to overrely on argument from authority/consensus heuristics “for their own good”. It can be easily hijacked by someone else who is not “good”.

        • Compare:

          Nullius in verba (Latin for “on the word of no one” or “take nobody’s word for it”) is the motto of the Royal Society.

          Now when I go to the Royal Society webpage the top article is about “trust in science”. https://royalsociety.org/

          This is the opposite of what it should be. Science is based on distrust of both others and yourself, not trust.

        • >Science is based on distrust of both others and yourself, not trust.

          Good idea. I went to their website and did not see a top article or an article about “trust in science”.
          ;-)

        • jd said,
          “I went to their website and did not see a top article or an article about ‘trust in science’.”

          Scroll down just a little bit from the top of the linked page — there is something saying
          “BLOG
          Following the science: Venki Ramakrishnan, President of the Royal Society, addresses trust in science during the coronavirus (COVID-19) pandemic”

          I clicked on the BLOG link. The first two paragraphs of the blog entry read,
          “Evidence-based decision making should absolutely be a cornerstone of government, especially in a pandemic for which science is of paramount importance to our response. However, we must recognise both the potential and the limits of science. In an emergency, data for decisions may be uncertain, incomplete, or even missing. Nevertheless, rapid decisions have to be made. Science advice is only one aspect of this. Ministers also need to consider economics, ways of implementation, and broad consequences to society, and they need to be able to take the public with them.

          At the frontiers of science, there is always uncertainty, and to pretend otherwise would be foolish. What science does is to try to gather evidence to reduce the uncertainty, but this happens only gradually as data are gathered and hypotheses tested and discarded until some idea of the truth emerges. But even those “truths” can fall by the wayside in the face of new and contradictory evidence. The entire process is based on honesty, openness and transparency, in which the evidence is published for all to see and argue about. It is no coincidence that scientists are highly trusted. ”

          This seems consistent with the tenor of the blog we are commenting in.

    • Joshua said,
      ““People” is a pretty broad term there. Some probably will. Probably some won’t and reach a kind of status quo. And still others will see it as a badge of honor to not wear masks, not limit exposure, etc. and in fact increase their levels of infectious behaviors largely in inverse proportion to the extent to which the media report on increasing infections.
      And it is likely to be a regional distinction, to some degree.”

      +1

    • +1

      For what it’s worth, I live in Michigan’s Upper Peninsula, which has just opened as part of a phased plan here in the state. Based on my n=1 observation, we are already seeing an unprecedented influx of tourists from downstate this holiday weekend, which I imagine will not abate. Usually tourists do not arrive until later in June, but there was already a record long wait to get over the bridge and hotels are filling up. There is a sizeable group of people who are determined to take their regular summer vacation in spite of the virus.

      The R0 up here was lower and more stable from the outset compared with the lower peninsula. I assume the R0 reached somewhat of a policy-induced floor during the lock down which can only trend upwards as local mobility increases and people come in large numbers from downstate and other states such as Illinois and Wisconsin. However, I suppose this depends on local factors too. For instance, many restaurants are still not opening for dine-in here and residents are limiting exposure. Will this counteract the influx of people from the hot spots downstate? Hard to say.

      I would hazard that even if we don’t see major changes in transmission at the state level, there will be greater changes in transmission within states or across state boundaries in regional corridors that experience high summer travel, until some sort of new equilibrium is reached.

  4. > Even if formal social distancing rules are relaxed, if the death rate continues, won’t enough people be scared enough that they’ll limit their exposure, thus reducing the rate of transmission?

    I think this sentence works without ‘if the death rate continues’. Having that in there seems to imply there’s this thing called death rate that people are tracking and it is the thing driving their decision making, and that is not clear.

    But still your point that at an individual level people will still behave defensively seems reasonable. But schools/businesses reopening presumably are relevant here, and those decisions aren’t so much made at the individual level.

    > This is not to suggest that the epidemic will go away, just that maybe people’s behavior will keep the infections spreading at something like the current rate?

    I certainly don’t know. I’d guess re-opening will have an effect comparable to some large fraction of the closing effect, whatever that was. So I’d think if closing changed things, then opening will change things, and assuming that opening won’t change things is either saying something like:

    1. The closing effect came mostly from individual decision making (which we assume remains defensive, regardless of openings)
    2. The original closing effect was small

    I don’t think we think those things.

    • I have seen mobility data, gathered from cell phones, that indicates that people were already decreasing their mobility at least a week before any of the shelter-in-place orders were given, and that the downward trend in mobility did *not* accelerate after those orders but just continued on its same course for a while.

      • Someone on this blog pointed out when those data were published that they used the date of the official stay-home order as the start of the lockdown, but that in many states the closing of schools happened a week or more earlier. Once the schools have closed, a whole lot of adults have to stay home too.

    • I think the significance of individual vs. company (going to work-from-home etc.) vs. local government vs. state government decisions, and the size of the closing effect, is going to vary a lot between different parts of the US.

      Where I live, the state-wide order didn’t change much; first schools closed and companies went to work-from-home, then the city/county governments did orders, well before the state order.

      In some highly rural areas, the closing effect may not have been that large in the first place. Some European countries seem to be leaning toward the idea that schools are not a major problem because kids are mostly near-immune – *if* this is true, then some rural interior US states that didn’t do stay-at-home / shelter-in-place orders and close all “nonessential” businesses may not have changed very much at all. (Depending on how many mass gatherings e.g. South Dakota or Arkansas has in a normal spring – I’m not really sure…)

  5. Andrew asks: “One question I have is about the assumptions underlying “increased mobility following relaxation of social distancing.” Even if formal social distancing rules are relaxed, if the death rate continues, won’t enough people be scared enough that they’ll limit their exposure, thus reducing the rate of transmission?”

    Maybe, or the opposite may happen. Sitting here in NYC, I heard constant sirens at the height of the epidemic. But, in much of the country, people live in quite segregated communities. My fear is that people living in all of these relatively isolated towns may not even be aware that there is an outbreak in the next town over and they will relax social distancing. Local news isn’t what it used to be. The national news and media are dominated by NYC and California. We could be looking at lots of little outbreaks throughout the summer as the virus spreads through little towns and then comes back to major population centers in the fall for an even worse outbreak.

    • Steve said,
      “Local news isn’t what it used to be. The national news and media are dominated by NYC and California.”

      Very good point. I don’t think we take this into account as much as we should.

    • I agree the media gives a largely coastal/urban picture.

      I tend to think, though, that most rural areas are probably not going to be hit all that hard, with the exception of communities around meatpacking plants/prisons. South Dakota’s epidemic doesn’t seem to be spreading terribly easily outside the Sioux Falls area.

      That could certainly change. But right now this seems to be largely a “disease of dense populations”. The hard-hit areas: Wuhan, Lombardy, Madrid, NYC/New Jersey, Detroit… are very dense by US standards.

      Brazil might change this, or might not – the cities are very dense, but I don’t know how many of the cases/deaths are coming from the cities vs. more rural areas.

    • Sorry, learning the ropes of commenting on the blog. Above post was meant in reply to my earlier comment.

      Steve said: “People live in quite segregated communities.” How true this is, especially in metro Detroit. Combine this segregation with transmission and death rates that, although very high, are still low and heterogeneous enough, and the result is that for many people the disease still exists largely in the abstract. For many people in the suburbs, experience with the disease is indirect–removed multiple degrees (a family member, friend of an acquaintance, etc.)

  6. Thanks for the post, and the openness of the approach!

    “We project the impact of future increases in mobility, assuming that the relationship between mobility and disease transmission remains constant”

    That assumption seems very questionable, even separate from the points Andrew and others have raised about people limiting their mobility on their own. The link between “mobility” and transmission isn’t a simple one. Going to large indoor gatherings (concerts, etc.) is very different than going for a hike, though traveling to each counts as “mobility.” I see lots of indications of doing the latter (which I think is great) and zero for restarting the former. Similarly, gatherings involving the elderly are different than those for the general population, which I think is perhaps beginning to sink in — note that 80% of Covid-19 deaths in Canada are in long term care facilities! It’s very hard for me to imagine that the mobility-transmission relationship in June will be anything like the mobility-transmission link for March.

    • You’re talking about sort of “personal” mobility. And that’s a good point. But in normal times, most weekday mobility is driven by a morning commute to work, lunch at restaurants and cafeterias, a commute home, and dinner at restaurants, also dropping off and picking up kids at schools. If people return to that pattern then things could get ugly.

      • True, but even “normal” weekday mobility is *very* different — commuting to a workplace that used to have large in-person meetings plus working in an office is different than presently commuting to a workplace that only has people work in offices and prohibits all gatherings, though both would show up the same in measures of “mobility.”

        For work as well as “personal” activities, I find it very hard to imagine that the mobility-transmission relationship isn’t radically altered, so much so that thinking it isn’t is flawed even as a zeroth-order approximation. Certainly in all work/school contexts I’m aware of, it’s *very* different (for better or for worse…).

        • Indeed, they explicitly state this in the article as well, from page 16:

          “Our results also do not account for behavioural changes that may occur such as increased mask wearing or changes in age specific movement. Therefore, our scenarios are pessimistic in nature and should be interpreted as such”

    • Yeah, I really don’t think the mobility/transmission relationship will stay the same, especially since there’s some evidence that COVID transmission is way more driven by super-spreader events than, say, flu. (If the R0 is 2 or 3, but one person at a choir practice infects 50 people, then there have to be a lot of infected people that don’t infect anyone to bring the average down…)

      So there may be a “tail” effect where a small percentage of activities cause most of the spread.
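The averaging argument in the parenthetical above can be checked with a little arithmetic. This is only an illustrative sketch: it assumes a group made up of one super-spreader plus people who infect nobody, and the function name `zero_spreaders_needed` is made up for this example.

```python
def zero_spreaders_needed(superspreader_infections: float, mean_r: float) -> float:
    """People infecting nobody that must accompany one super-spreader
    for the group's average number of secondary infections to be mean_r.

    Assumption (illustrative only): the group is one super-spreader
    plus n people with zero secondary infections, so
    superspreader_infections / (1 + n) == mean_r.
    """
    if mean_r <= 0:
        raise ValueError("mean_r must be positive")
    return superspreader_infections / mean_r - 1


if __name__ == "__main__":
    # The 50-person choir practice from the comment, for R between 2 and 3.
    for mean_r in (2.0, 2.5, 3.0):
        n = zero_spreaders_needed(50, mean_r)
        print(f"mean R = {mean_r}: about {n:.1f} non-spreaders per 50-person event")
```

Under that assumption, an average of 2.5 means each 50-person event has to be balanced by 19 infected people who infect no one.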

      Other problems:
      – if the R0 is near 1, small inaccuracies can make huge differences in results (growing vs. shrinking) over time. Since many US states seem to be near 1, I am extremely skeptical of anybody’s predictions at this point (not that there is necessarily anything wrong with the models, I’m just not sure the data is good enough to distinguish … garbage in/garbage out).

      – What about seasonality? Even if the effect is small, it could matter a lot if the R0 is near 1…
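To see how much a small error matters when the reproduction number is near 1, here is a rough sketch. It assumes a constant reproduction number and simple compounding per infection generation, which is far cruder than the report's model; the starting level of 1000 cases and the 20 generations are arbitrary choices for illustration.

```python
def projected_cases(initial_cases: float, r: float, generations: int) -> float:
    """Case level after some number of infection generations, assuming a
    constant reproduction number r (a deliberate oversimplification:
    no immunity, no behaviour change, no seasonality)."""
    return initial_cases * r ** generations


if __name__ == "__main__":
    # Same starting point, three estimates of R that differ by only 0.1.
    for r in (0.9, 1.0, 1.1):
        cases = projected_cases(1000, r, 20)
        print(f"R = {r}: {cases:,.0f} cases after 20 generations")
```

In this toy setup, an estimate that is off by 0.1 in either direction separates an epidemic that shrinks to roughly an eighth of its size from one that grows more than sixfold.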

  7. One thing I think is worth mentioning about this report is how it provides evidence for something I’d been hypothesizing, in the form of the graphs in Figure 1 (mobility graphs using Google data)

    https://www.imperial.ac.uk/media/imperial-college/medicine/mrc-gida/2020-05-21-COVID19-Report-23.pdf

    They have conveniently placed circle, triangle and square symbols along the top of the graphs to show the dates on which various states issued stay at home orders… You can see that many of the states where people have been saying things like “this state acted late but they didn’t really have a big problem”, such as FL, TX, GA, etc., put in official orders WEEKS after people had already taken action to reduce their movements. In fact, one of the striking things is that the entire country seemed to follow the same pattern of movement restriction, starting between Mar 9 and Mar 23 with a typical time around Mar 16, even though plenty of places didn’t place official restrictions until the last week of March through the second week of April.

    This says *yes* people *will* restrict their motions on their own… At least in face of reports about what was going on in Italy and NYC. At the moment, with things sort of stable, I don’t think we’re likely to get self restrictions until we have a second wave.

    Andrew, the big problem with *reactionary* restrictions, where people see a bad event and then restrict their movements, is the lag time between spreading the disease and seeing the downstream consequences. It seems that cases are ascertained about 7 to 10 days after infection, and hospital events are 15 to 21 days after infection, while growth rates can be as high as doubling every 3 days. Even with more restricted activity, you could easily imagine doubling every, say, 7 days, and by the time we’re seeing an upswing in hospitalization, even if we restrict activity, there will be many, many people sick (8x as many as 3 weeks earlier, when things seemed ok and people started moving around).

    We need surveillance, so we can detect growth in cases within a week rather than 3 weeks.
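The lag arithmetic in this comment can be sketched in a few lines. This assumes clean exponential growth with a fixed doubling time, which is a simplification; the function name `growth_factor` and the specific lags are illustrative choices taken from the numbers quoted in the comment.

```python
def growth_factor(lag_days, doubling_days):
    """Multiplicative growth over lag_days given a fixed doubling time."""
    return 2 ** (lag_days / doubling_days)


if __name__ == "__main__":
    # Compare detecting growth after 1 week vs. 3 weeks,
    # for doubling every 7 days and every 3 days.
    for doubling_days in (7, 3):
        for lag_days in (7, 21):
            factor = growth_factor(lag_days, doubling_days)
            print(f"doubling every {doubling_days} d, detected after "
                  f"{lag_days} d: {factor:.0f}x as many infections")
```

With a 7-day doubling time, a 3-week detection lag means 8x growth before anyone reacts, while a 1-week lag means 2x, which is the case for faster surveillance.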

    • Your point about timing is very good – but in Texas, while the *state* order came late, school districts had acted very early, many companies had gone to telework, etc. And some cities acted before the state. So it’s not just state vs. individual action, or even government vs. individual (since many companies that were capable of telework acted before they were required to).

      • Good points. But still things to think about — e.g., will “social distancing fatigue” set in and cause lapses? (Admittedly, that could be very hard to model!)

        • Social distancing fatigue – quite possibly, but hard to distinguish its effects from the effects of re-opening policies.

          I have hopes that seasonality will counter any effects from fatigue + reopening, and also that the things that are being re-opened were not major drivers of the risk anyway. (If it’s largely driven by super-spreader events, the only thing that really worries me that is being re-opened now are churches, because that’s often a *lot* of people. Things like hair salons, not so much. If – if! – most infected people don’t infect someone else and the fairly high R0 is because some percentage of people are super-spreaders who infect tons… then if there aren’t large groups that R0 will drop a lot).

        • Montgomery, Alabama hit ICU capacity today… I think lots of rural places simply hadn’t had much introduction when the country effectively shut down as mobility dropped. Now that they’re re-opening (AL began reopening May 11) they’re seeing their first wave finally begin to hit.

        • I don’t really know what is going on there – seeing somewhat contradictory news stories. (I also don’t know how Montgomery’s ICU capacity per capita compares to other parts of the US.)

          But in any case, I doubt it’s all that indicative of the larger picture. Several other states in the South reopened well before Alabama and haven’t seen much trouble.

          I’d also be surprised if Montgomery didn’t see introductions early – being a state capital on two interstates, it can’t be *that* remote. If there were really cases in Florida in January (as Florida news has reported), I can’t really see how it wasn’t all through the South.

        • All the indications are that the distribution of number of people infected by a reference case is extremely right tailed, with 0 being the mode. Differential equation models have the behavior that they under-estimate extinctions by a LOT because they work based entirely on the average of the distribution. Agent based models show that when you have these long tailed distributions you can repeatedly get lucky when the infection numbers are small… if in some region there were 10 or 15 people who were sick… a few weeks later it could be zero with reasonably high probability, especially in rural areas where contact rates are low. But when you finally kindle that fire big enough that extinction is not likely, then you get a real conflagration. I think that’s what’s going to happen over the next few months throughout the states where things never got running.

          I don’t have county by county data or anything, but looking at AL, they notched up on April 20 and then have steadily increasing cases per day. Their hospitalizations have been steadily and linearly increasing the whole time. Basically they’re in trouble.
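The extinction argument above can be illustrated with a standard branching-process calculation. This is a sketch under stated assumptions: the number of people each case infects is taken to be negative binomial with mean `r` and dispersion `k`, and the values used (mean 2.5, `k` of 0.1, 1 and 10) are hypothetical, not estimates from the report. The chance that a chain started by one case dies out is the smallest fixed point of the offspring generating function, and with ten independent starting cases it is that probability raised to the tenth power.

```python
def extinction_probability(r, k, iterations=10_000):
    """Chance that a transmission chain started by one case eventually dies out.

    Offspring counts are assumed negative binomial with mean r and
    dispersion k (smaller k means a heavier right tail). The result is the
    smallest fixed point of the generating function
    G(s) = (1 + (r / k) * (1 - s)) ** (-k), found by iterating from s = 0.
    """
    s = 0.0
    for _ in range(iterations):
        s = (1.0 + (r / k) * (1.0 - s)) ** (-k)
    return s


if __name__ == "__main__":
    # Hypothetical dispersion values; the mean number infected is 2.5 throughout.
    for k in (0.1, 1.0, 10.0):
        q = extinction_probability(2.5, k)
        print(f"k = {k}: one case dies out with p = {q:.2f}, "
              f"ten cases with p = {q ** 10:.3f}")
```

The same average can give very different die-out odds depending on the tail: a model that tracks only the mean, as a deterministic differential equation does, cannot represent this.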

        • >>All the indications are that the distribution of number of people infected by a reference case is extremely right tailed, with 0 being the mode.

          I think this is true, but I disagree on the implications for (at least many) rural areas in the US.

          I think this will mean that there will continue to be outbreaks in high-risk locations (meatpacking plants, prisons etc.) but they mostly won’t spread beyond the immediate community very effectively.

          If Montgomery, or Alabama as a whole, is in real trouble (I’m not saying they’re not – just that I’ve seen somewhat-contradictory stories and don’t know enough about the area to judge) it might be because they are in a dangerous “middle zone”, not quite low-density enough.

        • Hair salons do worry me. They seem to be places where a single highly contagious person could generate a lot of infections. I intend to continue to cut my hair myself until infection rates die down to practically zero.

        • I think this is right. Hair salons involve plenty of time in close contact, and usually a lot of talking. I think if people want to open hair salons they should be doing stuff outside, or with every door and window open, and clean up and bleach the floors and countertops and everything every hour between customers. The indoor closed space and length of contact all worry me.

        • Hmm, yeah, a worker there could infect quite a few people before they became symptomatic if there weren’t precautions. I was just thinking that there aren’t many people in the building at one time… but the overall number of contacts during the asymptomatic period is probably more important.

        • As should every element of commerce which can conceivably be relocated outside to plain air. Including: grocery pickup! Large parking lots can be used efficiently for this purpose. Morale would improve substantially among staff! And the salutary improvement in commerce and the economy would be not insubstantial. This idea needs to gain traction!

        • (Meant to reply to Lakeland, not to confused’s response to Lakeland)

          As should every element of commerce which can conceivably be relocated outside to plain air. Including: grocery pickup! Large parking lots can be used efficiently for this purpose. Morale would improve substantially among staff! And the salutary improvement in commerce and the economy would be not insubstantial. This idea needs to gain traction!

    • > mobility graphs using Google data

      Have we had any arguments about the Google mobility data? Are these numbers good? The lack of variability in the Google mobility stuff bothers me.

      Why is every state the same? Like why does every state have that huge down-spike around April 10th-15th?

      • Why is every state the same? Because everyone was watching Italy melt down, and everyone was talking about how rapidly the infections were increasing, and everyone was reloading the JHU map several times an hour, and everyone was talking about the runs on grocery store shelves etc. Op-Eds in NYT, WaPo, The Atlantic, LA Times etc were all saying “close everything” as early as last week of Feb to first week of Mar.

        None of this was driven by local cases, it was all global news.

        Italy locked down Lombardy Mar 9, and the whole nation Mar 10. Within days everywhere in the US was plummeting in mobility. Here in SoCal the Pasadena school district closed Mar 13, the CA-wide shelter order came Mar 19. I had already pulled my kids from school Mar 4, and I have emails telling my kids’ school that I was canceling all in-school volunteer science activities as early as Feb 26. All of that was driven by watching what happened in Wuhan and Italy. Reports of people being welded into their apartment complexes out of Wuhan were coming as early as Feb 11-18.

        The mobility data is consistent with data I’ve seen on restaurant sales. It might not be perfect, but I think the synchronization reflects the global news system.

        • Hmm, I mean, sure there was stuff going on in the news, but a simpler explanation is there is a large amount of pooling in however this mobility model works which gives the appearance of everyone being in sync.

          In this case it wouldn’t be good to come to any conclusions about how we all shut down together — that could just be an artifact of how they’re making the number.

          Is this data coming from Android phone owners? If so, is this adjusted to non-Android phone owners? Is mobility in a place hours spent in a place?

          Maybe Android phone owners are so homogenous as to all lower their mobility by like 20 percentage points relative to baseline on April 12th in 50 different states, but that seems really unlikely.

        • I think you’re underestimating how many decisions are made with nationwide effect.

          Company I work for (with 65,000 employees) is headquartered in NYC but has less than 5% of our employees working there. When NY State went on mandatory lockdown (March 22 if memory serves), the company went mandatory work-from-home everywhere.

          Another example – McDonald’s shut down all its in-store dining nationwide on March 16.

          This is a good point, Daniel — one I hadn’t thought through previously. That mobility data is interesting.

        • Yeah, I really don’t know how this stuff works, but it all looks surprisingly uniform.

          I looked around a little for information and found this blurb (https://www.google.com/covid19/mobility/data_documentation.html?hl=en):

          “We calculate these insights based on data from users who have opted-in to Location History for their Google Account, so the data represents a sample of our users. As with all samples, this may or may not represent the exact behavior of a wider population.”

          Like, I get it, that’s the sample, but it’s like doing polls — it’s nice when people recognize the limitations of non-representative polls, but if no efforts are made to adjust it then, well, that’s it. It’s not representative.

          Maybe it doesn’t matter if we’re going to look at this like some sort of machine-learning magic feature we plug into our regressions, but the post at the top is starting to interpret the meaning of this mobility data directly, and it’s not clear to me we should be doing that.

          There’s this which has some stuff in it: https://support.google.com/covid19-mobility/answer/9825414?hl=en&ref_topic=9822927

          “Remember that these mobility reports show relative changes, and not absolute visitors or duration. For example, if few people normally visit places of work on a Sunday, you wouldn’t expect to see large changes to Sunday”

          And that makes me think this isn’t even the covariate we’d want for our models! The baseline is changing every day, which seems weird.

          There’s an arxiv paper on this data, but it’s focused on privacy: https://arxiv.org/abs/2004.04145 , which is something, but it’s not super relevant to the statements we’re making here (people in X states stopped moving around much at the same time).

        • Oh well, I guess April 12th was Easter Sunday. I guess that’s something, but it isn’t very satisfying. Everyone celebrates Easter uniformly?

    • I’m more in doubt about the progression Daniel details in the comment above. Example: our Town of 60,000 (so really a city) just completed random testing and found 7% penetration. Other tests in MA have found 8-10%, so this is reasonable. But we have 314 confirmed cases, not over 4000. And a significant percentage of the confirmed cases are from a few care facilities. (The one around the corner from me has been reported as massively penetrated. Lots of ambulances, and every day or so emergency response shows up.) What is the progression in that?

      • Remember the false positive rates on these serology tests are variable and quite high-ish, like it’s totally normal for you to get 5% false positives, so a 7% raw rate in a survey could mean maybe 2% in the population or less. I doubt the town where you live is using the kinds of Bayesian analysis including false pos/neg that Andrew posted in the last few days.

        We’re finding survey after survey that shows 4-7% penetration, which is also just about the level you’d expect from false positives alone or at least false positives around as common as true ones. It really seems to suggest that serology can only provide us with accurate data once attack rates rise above about 15%.
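The false-positive arithmetic here can be written down with the standard Rogan-Gladen correction. This is a point-estimate sketch only, not the Bayesian analysis mentioned above, and the test characteristics used (95% specificity, 100% sensitivity) are assumptions chosen to match the "5% false positives" example rather than properties of any particular assay.

```python
def adjusted_prevalence(raw_positive_rate, sensitivity, specificity):
    """Rogan-Gladen point estimate of true prevalence from a raw positive rate."""
    estimate = (raw_positive_rate + specificity - 1.0) / (sensitivity + specificity - 1.0)
    # A raw rate below the false-positive rate gives a negative estimate,
    # so clip to the valid range [0, 1].
    return min(1.0, max(0.0, estimate))


if __name__ == "__main__":
    # Assumed test characteristics: 5% false positives, no false negatives.
    for raw in (0.04, 0.07, 0.15):
        adj = adjusted_prevalence(raw, sensitivity=1.0, specificity=0.95)
        print(f"raw {raw:.0%} positive -> adjusted prevalence about {adj:.1%}")
```

Under these assumptions a 7% raw rate corresponds to roughly 2% true prevalence, and the estimate is very sensitive to the assumed specificity when the raw rate is this low.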

        • I have no idea, but this and another was designed and run by Mass General. But I don’t know how they analyzed the blood, which was what they collected. Participants were selected using randomized Town census data, and they excluded anyone who had shown virus symptoms. I understand they had a very high compliance rate.

  8. I entirely agree that people have largely limited their social interactions independent of government. Social pressure has been intense. The media spreads scare stories. Two nights ago I saw Andrew Cuomo say that a rare complication, which rarely becomes serious, is ‘more dangerous than coronavirus’, thus scaring the bleep out of every parent. Maybe he meant it’s more dangerous in a handful of cases, but that’s not what he said.

    But I think there is a growing awareness that this is not only a disease of the elderly but of those in nursing homes/care facilities. The state level data collection is shameful, but the numbers appear fairly consistent; over half of all deaths are from care facilities. This appears to include under-reported deaths, which is sensible because someone would notice if a bunch of 30 year olds suddenly died but no one notices extra deaths among 90 year olds in a facility. MA reports 3755 deaths in care facilities out of 6148 total or 61%. (That, again, is likely an undercount of both deaths and the percentage in care facilities.) Given that care facilities only hold about 40k people in MA, the failure to throw resources at this group is shameful. Note that this percentage has been going up over time with better reporting.

    The initial surge of hospitalized cases appears more related to the penetration of care facilities and thus the exposure of the most vulnerable of the most vulnerable. I note that the percent of deaths 70 and over is 86%. No one under 20 has died in MA, only 8 under 30, and only 18 in their 30’s. To keep going, because the numbers remain consistent day after day, only 54 people in their 40’s have died in MA as of 5/21. The risk only becomes appreciable in the 50’s.

    As an aside, it’s unclear the extent to which being old is a risk factor. 98% of the dead have underlying conditions, whatever that means, but it’s impossible to get into it deeper because 61% come from care facilities. If I allocate those to 70’s and 80’s+, then the risk of being old drops substantially. I tend to believe nursing homes are themselves a cause: either because of multiple exposures or because the artificial sterility helps it establish. But my cousin thinks it’s just really old immune systems in people who can’t care for themselves.

    To get back to social distancing, I think it’s beginning to penetrate that, scare stories aside, the risk for younger people is very low. I also think we’re in a transition state in which people will begin to realize that partial opening doesn’t pay the bills. Only a few weeks ago, people were predicting the economy would rebound fast. I did not. Very few businesses can survive on much lower revenue.

    • Forgot: when this started, I would look at deaths, cases, etc. by age. The risks were always low below age 50, but they’ve continued to drop. Using just confirmed cases, which are now demonstrably lower than actual, the risk of dying under age 30 is .03%. It was .04%. In your 40’s, it’s .06%. It’s only in the 50’s that the risk using just these numbers reaches .2%, meaning higher than flu.

        • But presumably not if you get the high-dose version of the flu vaccine that is licensed specifically for those over 65?

        • You can find a zillion descriptions of flu death rates, but I just checked current CDC data. The rate is higher for flu for groups below age 20, because the rate for Covid below age 20 is really, really low. The flu rate over ages to 49 is .02%. So you can point and say I’m a hundredth of a percent off, not even a tenth of a percent, but I think a basis point is fairly decent.

          I did not compare a decade’s risk to all flu risk.

          Now, this is MA data only. National data could be higher. One question is whether that trends toward the MA form or toward a higher form.

        • Yeah, I think there’s little question that flu is more dangerous at the youngest ages – seasonal flu has a U-shaped curve with a spike at infancy and a larger spike among the oldest, COVID doesn’t (it increases with age consistently, but sharply).

          The 1918-19 flu pandemic showed a W-shaped curve, with peaks at infancy, young adult, and elderly.

          I think 2009-10 was relatively flat.

          It would be interesting to know at what age the COVID risk “crosses over” the flu risk.

        • CDC estimates for the 2017-2018 season (the worst of the last decade)

          18-49 2.8K dead 14M cases 0.02%
          50-64 6.7K dead 13M cases 0.05%
          65+ 51K dead 6M cases 0.85% (ouch)

          overall about 0.13% died.

          CDC doesn’t split their data into clean decades, at least on the page I found.

    • > The initial surge of hospitalized cases appears more related to the penetration of care facilities and thus the exposure of the most vulnerable of the most vulnerable.

      What do you base that on, other than impression?

      Put more precisely: what statistics have you seen that show or suggest that the disease has been more prevalent in care facilities than in the general population?

      • I don’t think it has to be more prevalent, for that to be true. It’s just much more *visible*, since the effects are far more severe.

        It’s more that the disease can spread for a while, but when it’s mostly infecting young more mobile people, it’s spreading fairly “silently” (few people getting sick enough to seek testing, very few deaths). When it gets into a vulnerable population, then you see a huge spike (in deaths, if it’s a nursing home, or at least in cases, in a warship/prison/meatpacking plant/etc.)

        • Then why characterize it as “the initial surge of hospitalized cases”? If it’s penetrating all groups equally, and that penetration persists, then it will be the initial surge, the secondary surge, the tertiary surge, etc.

          In other words, that will simply be how it proceeds throughout the pandemic — the most vulnerable population is the most hospitalized population.

        • Sure, there can be other surges. But “the initial surge” is when it first becomes obvious that a region has a problem (and, back in the first half of March when testing was bad, maybe even where it first becomes obvious that COVID is present in a region).

          20 Spring Breakers between 18-25 might get infected in Miami and then go back where they came from. There’s pretty good odds none of them will be hospitalized. They might infect a bunch of their friends in the same age, and you’re still not going to see many hospitalizations. Maybe one, but that doesn’t really tell you much, and back when we thought there wasn’t community spread in much of the US, it might not be recognized as COVID. But when it gets into a more vulnerable population, suddenly there’s a spike – in hospitalizations and then deaths, though maybe not infections.

        • You’re still not following. He frames it as “The initial surge of hospitalized cases appears more related to the penetration of care facilities and thus the exposure of the most vulnerable of the most vulnerable.”

          The use of “initial surge of hospitalized cases” suggests that the penetration of care facilities was overweighted among them, and that there will be fewer hospitalizations per infection in future because the “initial surge” was mainly related to the most vulnerable being exposed.

          If his point is simply that the aged and infirm are the most vulnerable, I agree with that. But that’s not a feature of just the initial surge: if we allow the infection prevalence to get to the point of herd immunity, a lot more of the vulnerable in care facilities will be hospitalized and killed. Even in hard-hit NYC, infection prevalence is only ~20%. Herd immunity will be around 80% with overshoot. So we’ve only seen about 1/4 of the infections we would see along the way to herd immunity, and presumably we’ve only seen about 1/4 of the nursing home deaths we would see along the way too.

        • Oh, sorry, maybe I misunderstood.

          I thought the point was that we missed how widespread it was early on, not just because testing was bad, but also because the most “mobile” people were hospitalized less.

          >>Herd immunity will be around 80% with overshoot.

          Maybe. I’m no epidemiologist, but there have been papers/preprints saying that this is hugely dependent on the network of social contacts… the simple prediction of herd immunity at 50% for R0 = 2, 67% for R0 = 3, etc. (before overshoot) assumes the population mixes evenly / everyone has an equal number of contacts, which is clearly not true in the real world.

          High seroprevalence (60%+) has been reported (at least in the media… don’t know how reliable) from a couple of towns in northern Italy, so some places can go high, but that may not be representative of everywhere.

          I think 80% as an overall, in-general, number is really pessimistic.
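For reference, the "simple prediction" numbers in this exchange can be reproduced as follows. This is a sketch of the textbook homogeneous-mixing case only, which is exactly the assumption questioned above: the threshold is 1 - 1/R0, and the final attack rate including overshoot solves z = 1 - exp(-R0 * z) for an unmitigated epidemic in a fully susceptible population. Function names are illustrative.

```python
import math


def herd_immunity_threshold(r0):
    """Immune fraction at which R drops to 1, assuming even mixing."""
    return 1.0 - 1.0 / r0


def final_attack_rate(r0, iterations=200):
    """Fraction ever infected in an unmitigated epidemic (includes overshoot).

    Solves z = 1 - exp(-r0 * z) by fixed-point iteration, assuming even
    mixing and a fully susceptible starting population.
    """
    z = 0.5
    for _ in range(iterations):
        z = 1.0 - math.exp(-r0 * z)
    return z


if __name__ == "__main__":
    for r0 in (2.0, 3.0):
        print(f"R0 = {r0}: threshold {herd_immunity_threshold(r0):.0%}, "
              f"final attack rate {final_attack_rate(r0):.0%}")
```

Under even mixing, R0 = 2 gives a 50% threshold and a final attack rate near 80%. Heterogeneous contact networks, as noted above, can lower both figures, and this sketch does not model that.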

          >>and presumably we’ve only seen about 1/4 of the nursing home deaths we would see along the way too.

          That seems hugely dependent on how well we can protect nursing homes going forward. With the right policy, we might see very few new infections in nursing homes. (For example, in hard-hit areas, there are probably enough immune people to take care of the vulnerable people…)

  9. Where do they get figure 14 showing infectiousness of an individual over time (x axis caption also has a typo)?

    From what I’ve read people are non-infectious for a few days after exposure, then are infectious from about -2 days to 7 days after becoming symptomatic.

    Whereas the virus was readily isolated during the first week of symptoms from a considerable fraction of samples (16.66% of swabs and 83.33% of sputum samples), no isolates were obtained from samples taken after day 8 in spite of ongoing high viral loads.

    https://www.nature.com/articles/s41586-020-2196-x

    Assuming an incubation period distribution of mean 5.2 days from a separate study of early COVID-19 cases1, we inferred that infectiousness started from 2.3 days (95% CI, 0.8–3.0 days) before symptom onset and peaked at 0.7 days (95% CI, −0.2–2.0 days) before symptom onset (Fig. 1c). The estimated proportion of presymptomatic transmission (area under the curve) was 44% (95% CI, 25–69%). Infectiousness was estimated to decline quickly within 7 days.

    https://www.nature.com/articles/s41591-020-0869-5

    So this curve would then be zero for ~3 days at the beginning and have a sharper peak around 5-6 days (onset of symptoms) that is back to near zero by 12 days.

    Also, as usual I think plotting the number of cases/deaths without the number of tests performed is very misleading.

    • The given curve doesn’t seem too bad to me compared to the stuff you’re linking to. I’m pretty sure it’s something they eyeballed based on reading the kind of thing you mentioned. Or maybe chose a functional form, and then tried to fit some basic stats to the kind of thing you link to.

      Remember that it can be hard to figure out when someone was infected. And most studies aren’t going to look hour-by-hour. So if someone is infected at 5pm on Monday and swabbed on Wednesday at 9am, is that recorded as 2 days? Etc.

      • > I’m pretty sure it’s something they eyeballed based on reading the kind of thing you mentioned. Or maybe chose a functional form, and then tried to fit some basic stats to the kind of thing you link to.

        Do they explain this anywhere? And from what I’ve seen the duration of infectiousness plays a huge role in how things play out in these SIR models. It seems like it is being treated like an afterthought.

        • If you look here: https://en.wikipedia.org/wiki/Compartmental_models_in_epidemiology#Bio-mathematical_deterministic_treatment_of_the_SIR_model

          You’ll see that in the simplest model: R0 = Beta/Gamma, where Beta is in dimensions of 1/time, and gamma is in dimensions of 1/time. Gamma is basically 1/duration_of_infection

          Take the basic equations as shown, and divide all of them by Gamma. Now create a dimensionless variable t* = t Gamma then you’ll have dimensionless equations as follows:

          dS/dt* = – R0 IS/N
          dI/dt* = R0 IS/N – I
          dR/dt* = I

          Since they’re estimating R0 from data, the result of having a duration that’s too long will be to have a dimensionless time t* that’s a little too short. In other words, things won’t be too big or too small, they’ll be too fast or too slow, but it’ll all be by a constant factor. That doesn’t worry me so much.
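The rescaling argument can be checked numerically. The sketch below is an assumption-laden illustration, not the report's model: it integrates the basic SIR equations with a simple Euler step for a fixed R0 of 2.5 and two hypothetical infectious durations (5 and 10 days), then compares the height and timing of the peak.

```python
def simulate_sir(r0, infectious_days, days=400.0, dt=0.01, initial_infected=1e-4):
    """Euler-integrate basic SIR; return (peak infected fraction, day of peak)."""
    gamma = 1.0 / infectious_days   # recovery rate, 1 / duration of infection
    beta = r0 * gamma               # transmission rate implied by R0 = beta / gamma
    s, i = 1.0 - initial_infected, initial_infected
    peak, peak_day = i, 0.0
    for step in range(int(days / dt)):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        if i > peak:
            peak, peak_day = i, (step + 1) * dt
    return peak, peak_day


if __name__ == "__main__":
    # Same R0, two hypothetical infectious durations.
    for duration in (5.0, 10.0):
        peak, day = simulate_sir(2.5, duration)
        print(f"duration {duration:.0f} d: peak {peak:.1%} infected on day {day:.0f}")
```

With R0 held fixed, doubling the assumed duration leaves the peak height essentially unchanged and roughly doubles the time to reach it, which is the "too fast or too slow by a constant factor" behavior described above.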

        • > In other words, things won’t be too big or too small, they’ll be too fast or too slow, but it’ll all be by a constant factor. That doesn’t worry me so much.

          Wasn’t the point of the lockdown to “flatten the curve”? That is the type of thing this assumption affects.

        • If my above analysis is correct, not really. There’s really only *one* free parameter that describes the shape of the curve, R(t), and since you’re fitting R(t) to data, the data is constraining your curve rather than the assumption. If you change the assumption the fit will cause a compensating change in the coefficients of the fit to R(t) and you’ll get the same (or similar) curve.

  10. Great work, as usual with the various implementations of this COVID-19 model.

    It would be useful to run the analysis on counties instead of states, since county-level mortality data are available for the US, and since county data could help to clarify the important relationship between COVID-19 Rs and population density. Is a county-level analysis likely to be achievable by Stan, or would it be prohibitively computationally expensive?

    The authors examine the association between Rs and population density at the state level, which provides some very suggestive evidence (see their Figure 3). However, population densities are so uneven within states that a county-level analysis would be much more informative here.

    I’m also curious about robustness of results to their case seeding assumptions (see bottom of page 20).

  11. Looking at Illinois, I’m not sure the model is doing a great job (at least in Illinois!) in accounting for the number of daily tests. R_t is oscillating weekly (presumably due to big test releases on Tuesday) and a big spike in R_t corresponds to a big increase in the number of tests. Unfortunately, since each state probably has its own idiosyncrasies in test reporting, this will be difficult to get right.

    I applaud the transparency though, which allows us to do all this nitpicking :) I wish all internal models used by states were also made public.

    • Hi cozzyd, Note that the model ignores the reported counts of cases because the authors consider the information too unreliable to be worth using (for example, because of changes in test availability that you mention). Instead, the model estimates case counts from counts of reported deaths.

      The weekly oscillation in Rt probably corresponds to weekly cycles of mobility.

      I agree it’s unlikely the model gets all states right though. Better to look at overall patterns.

      • Where has it been established that “reported death + covid positive test” is a more reliable number than “reported covid positive test”?

        Afaict, that is just a really dubious assumption people started making for some reason.

        • I had seen deaths + ICU because it will be a more stable quantity since a lot of people pass from therapy before dying, oscillations in the time required to die + report the death will cancel. A similar argument with positive cases is possible, but since deaths << cases it seems dubious to me too.

  12. Andrew – if you’re interested, I’d be curious to read your (non-overly technical) reaction.

    -snip-

    > Abstract
    COVID-19 has become a global pandemic, resulting in nearly three hundred thousand deaths distributed heterogeneously across countries. Estimating the infection fatality rate (IFR) has been elusive due to the presence of asymptomatic or mildly symptomatic infections and lack of testing capacity. We analyze global data to derive the IFR of COVID-19. Estimates of COVID-19 IFR in each country or locality differ due to variable sampling regimes, demographics, and healthcare resources. We present a novel statistical approach based on sampling effort and the reported case fatality rate of each country. The asymptote of this function gives the global IFR. Applying this asymptotic estimator to cumulative COVID-19 data from 139 countries reveals a global IFR of 1.04% (CI: 0.77%,1.38%). Deviation of countries’ reported CFR from the estimator does not correlate with demography or per capita GDP, suggesting variation is due to differing testing regimes or reporting guidelines by country. Estimates of IFR through seroprevalence studies and point estimates from case studies or sub-sampled populations are limited by sample coverage and cannot inform a global IFR, as mortality is known to vary dramatically by age and treatment availability. Our estimated IFR aligns with many previous estimates and is the first attempt at a global estimate of COVID-19 IFR.

    -snip-

    Of course, I’d be curious to read reactions of others as well.

    https://www.medrxiv.org/content/10.1101/2020.05.11.20098780v1

    • Given the uncertainties with the seroprevalence testing, I’ve been thinking that the IFRs derived from using symptomatic cases, discounted by a given asymptomatic rate (the CDC is saying 35%), might make sense.

      Seems like this might even make more sense? I do have an advanced degree in armchair epidemiology, but my GREs weren’t top of my incoming class at Interwebs University.

      • Andrew –

        > I’m wary of talk of “the” IFR,

        I agree. Almost better, as someone said here, to consider it as separate diseases for different age groups.

        On the other hand…IFR has become a political pivot point, with scientists advocating publicly for policy based on their interpretation of a single IFR.

        Not sure that the way through that is with better analyses (deficit model thinking insufficiency). But maybe there might be some value in at least attempting to counter misinformation if the whole “It’s basically like the seasonal flu” seems just wrong.

        If that argument really does lie outside the mainstream of scientific opinion, I would like to know that even if (1) that doesn’t make it dispositive and (2) I don’t expect a mainstream consensus to convince anyone who’s ideologically disposed to disbelieve it.

        And just in an academic sense, I’m kinda curious what you think of the methodology.

      • Yeah, if you say that you can’t use serology studies because “mortality is known to vary dramatically by age and treatment availability”, then why doesn’t that also tell you not to bother with a global IFR?

        I also think any estimate of a global IFR at this point is going to be — if close to right — right mostly by luck, since the nations we have halfway decent data from are not anything like a representative sample of the world in terms of demographics and medical care (they’re mostly developed nations, therefore older demographic but better medical care).

        • confused –

          > if you say that you can’t use serology studies because “mortality is known to vary dramatically by age and treatment availability”, then why doesn’t that also tell you not to bother with a global IFR?

          Well, just because you think that serological studies are ill-suited for the task doesn’t imply that you don’t think that there’s any value in determining an IFR.

          Personally, I think the value is quite limited – except to the extent that world famous epidemiologists with powerful reputations are going on national TV campaigns to argue that government policies to address the pandemic are “draconian,” and that the similarity of the COVID-19 IFR to the flu means that COVID-19 is “nothing to be scared about” (paraphrasing).

          So if studies show that the IFR is considerably higher than the seasonal flu – will that have some value? I don’t know. If other methods for determining the IFR are more reliable, does that matter? I don’t know. But I think it’s worth considering.

          Should we just not even bother? Should we just take those epidemiologists at face value? What’s the better alternative?

        • >>Well, just because you think that serological studies are ill-suited for the task doesn’t imply that you don’t think that there’s any value in determining an IFR.

          Sure, if the only problems were serology-specific. But they say mortality rates “vary dramatically by age and treatment availability”; since both those factors are going to vary dramatically between different nations, a global average is kind of questionable.

          And I don’t think we have particularly decent COVID data from any low-median-age / less-developed country yet; the (relatively…) good data is all places like Europe, South Korea/Japan/Taiwan/Hong Kong, the US…

          Extrapolation to sub-Saharan Africa (very low median age, but low medical treatment availability, and some nations have very high HIV/AIDS prevalence), South Asia or Latin America (low-ish median age, better medical treatment availability but probably not as good as the countries we have decent data from) is going to be, IMO, between very questionable and useless.

          Especially since we don’t know how much of an effect weather/seasonality will have, and the climates of those places are very different…

        • And if all you really know is “it’s got to be higher than what these other epidemiologists are saying”, I don’t think it really adds any value to come up with new numbers if the data isn’t good enough to make those numbers meaningful.

          And I rather doubt IFR by itself is the right measure to make policy decisions on anyway. What matters for making decisions is cost vs benefit -> how much our measures will affect the outcome, and what the opportunity costs (not just economic) of taking those measures are. For that, transmission characteristics are at least as important as IFR.

          Also:
          >> a global IFR of 1.04% (CI: 0.77%,1.38%)

          The fact that there’s less than a 2x difference between the upper and lower CI seems very questionable to me since South Asia + Sub-Saharan Africa + Latin America are a pretty significant fraction of the world population, and we don’t have much good data from there.

          Without a better handle than I think anyone has now on the relative importance of:
          – demographics (% of the population in high-risk age groups)
          – weather/seasonality
          – vitamin D
          – population density / multi-family residences / etc.
          – access to and quality of healthcare

          I think any extrapolation to areas other than fairly densely populated, temperate-zone regions with developed world demographics (high median age / large elderly population) involves huge uncertainties.

        • confused –

          I think that the value is limited.

          But I also think that it is being used to some degree for policy decisions no matter what you or I think of its merit.

          Further, it is being used, widely, in a media-focused policy advocacy campaign. It is being used, very deliberately and in a very focused manner to critique public health policies to address the pandemic. It is being used to justify the claim that shelter in place mandates are “draconian” and that they are “tyrannical” and that they are killing people and crushing our economy and being used to wage a political campaign against Trump. They are being used to justify lifting shelter in place orders prior to meeting the guidelines suggested by the CDC.

          As such, I think there is some merit in vetting the claims being made, and deliberating the best method for coming up with a number.

          That doesn’t mean that I favor using that number without appropriate caveats as to its limited value.

          Should we just not even bother? Should we just take those epidemiologists at face value? Is there a better alternative?

        • I just don’t see how anyone can know the *global* IFR with a confidence interval less than a factor of 2.

          We have very little data from South Asia and sub-Saharan Africa and very little good data from Latin America. Combined, these areas make up nearly half the world population, so they will have a huge impact on the global IFR. I don’t think we can yet know whether the IFR will be dramatically lower than Western Europe’s because of the far younger demographic, or whether less access to healthcare will cancel out that effect, or whether HIV/AIDS prevalence will make it worse than elsewhere in sub-Saharan Africa.

          >>But I also think that it is being used to some degree for policy decisions no matter what you or I think of its merit.

          Sure. I don’t think that justifies coming up with implausibly-precise values, though.

          >>Should we just not even bother?

          If you mean “bother with calculating IFRs”, certainly we should.

          It is of course always worth doing to get better data and do better analyses – that’s kind of the point of science – but only if uncertainty is appropriately addressed.

          >>Should we just take those epidemiologists at face value?

          No, of course not – but we also shouldn’t make the same error in the opposite direction, by claiming implausibly narrow confidence intervals for a higher IFR.

          >> Is there a better alternative?

          Yes – don’t try to calculate a global value at this point in the pandemic, since we don’t have good data from over half the planet. Focus on calculating values for more limited areas – like individual countries or states (and even those need to consider heterogeneity).

        • IFR would be meaningful if it were IFR_u where u = unconstrained. Then other estimates would be IFR_c where c = the constraints applied in effort to slow the spread.

    • I can’t get past figure 1. Is that telling us opening up the testing tap is going to reveal more… hospitalizations?

      Their 95% confidence interval seems absurdly narrow, given the data they’re working with. A range of 0.77% to 1.38%, based on just CFR and percent positive?

      Beyond that, I question their underlying assumptions and model. I don’t see any reason to assume that infinite testing capacity would result in all infections being identified.

  13. I did an analysis demonstrating Ioannidis’ statements about IFR because it’s quite obvious that apparent values will vary over a large range depending on the risk groups infected. That’s why the estimates from serological studies that include demographic information are the best ones.

    I will start with Ferguson’s age cohort IFR estimates. Virtually every serologic study in the US shows his IFRs are at least a factor of 2 too high. This gives me the following values. I have combined the 0-9 and 10-19 cohorts and the 20-29 and 30-39 cohorts.

    Cohort  Age    IFR      % of US population
    1       0-19   0.002%   27%
    2       20-49  0.0275%  28%
    3       50-59  0.3%     14%
    4       60-69  1.1%     14%
    5       70-79  2.5%     11%
    6       80-90  4.6%     4.2%
    Total IFR: 0.67%
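The total above is a population-weighted sum of the cohort IFRs. A minimal sketch that reproduces it from the table, with variable names of my own choosing; note that the listed population shares add up to 98.2%, not 100%, and the sum below is not renormalized:

```python
# (age range, IFR in percent, share of US population in percent), from the table.
cohorts = [
    ("0-19", 0.002, 27.0),
    ("20-49", 0.0275, 28.0),
    ("50-59", 0.3, 14.0),
    ("60-69", 1.1, 14.0),
    ("70-79", 2.5, 11.0),
    ("80-90", 4.6, 4.2),
]

# Weighted sum: each cohort contributes IFR * population share.
total_ifr_pct = sum(ifr * share / 100.0 for _, ifr, share in cohorts)
share_total_pct = sum(share for _, _, share in cohorts)

print(f"Population shares listed: {share_total_pct:.1f}%")
print(f"Population-weighted IFR:  {total_ifr_pct:.2f}%")
```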

    It’s now easy to do some calculations on apparent IFR’s depending on the age profile of those infected. For reference, in the US, expected mortality is about 2,840,000 per annum.

    Scenario #0
    Cohort  % infected  Fatalities  Infections
    1       5%          1,141       4,455,000
    2       10%         2,541       9,250,000
    3       15%         20,790      6,950,000
    4       40%         203,280     18,500,000
    5       60%         544,500     21,780,000
    6       80%         510,048     11,088,000
    Totals              1,282,300   72,023,000
    Apparent IFR: 1.78%

    Scenario #1
    Cohort  % infected  Fatalities  Infections
    1       10%         2,282       8,900,000
    2       20%         5,082       18,500,000
    3       30%         41,580      13,900,000
    4       40%         203,280     18,500,000
    5       60%         544,500     21,780,000
    6       80%         510,048     11,088,000
    Totals              1,306,772   92,668,000
    Apparent IFR: 1.41%

    Scenario #2
    Cohort  % infected  Fatalities  Infections
    1       10%         2,282       8,900,000
    2       20%         5,082       18,500,000
    3       30%         41,580      13,900,000
    4       35%         177,870     16,200,000
    5       40%         370,260     14,520,000
    6       45%         255,042     5,544,000
    Totals              852,116     77,564,000
    Apparent IFR: 1.10%

    Scenario #3
    Cohort  % infected  Fatalities  Infections
    1       40%         7,128       35,640,000
    2       30%         7,623       27,720,000
    3       20%         27,720      9,240,000
    4       15%         69,300      4,620,000
    5       10%         90,750      3,630,000
    6       5%          31,878      693,000
    Totals              234,399     76,593,000
    Apparent IFR: 0.31%

    Scenario #4
    Cohort  % infected  Fatalities  Infections
    1       60%         10,692      53,460,000
    2       60%         15,246      55,440,000
    3       30%         41,580      13,860,000
    4       10%         50,820      6,300,000
    5       5%          45,375      1,815,000
    6       5%          31,878      693,000
    Totals              150,216     129,753,000
    Apparent IFR: 0.116%
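The scenario arithmetic can be sketched as a small function. This is an illustration under stated assumptions, not a reconstruction of the spreadsheet behind the tables: the US population of 330 million is my assumption, the cohort IFRs and population shares are taken from the first table, and the function name is hypothetical. Because it recomputes everything from those inputs, its output need not match every hand-entered figure above.

```python
# Cohort IFRs (as fractions) and population shares from the first table.
IFR = [0.00002, 0.000275, 0.003, 0.011, 0.025, 0.046]
SHARE = [0.27, 0.28, 0.14, 0.14, 0.11, 0.042]
US_POPULATION = 330_000_000  # assumption, not stated in the comment


def apparent_ifr(attack_rates):
    """Deaths divided by infections, given the fraction infected in each cohort."""
    infections = [rate * share * US_POPULATION for rate, share in zip(attack_rates, SHARE)]
    deaths = [n * ifr for n, ifr in zip(infections, IFR)]
    return sum(deaths) / sum(infections)


# Attack rates by cohort, copied from Scenario #0 and Scenario #4.
old_skewed = [0.05, 0.10, 0.15, 0.40, 0.60, 0.80]
young_skewed = [0.60, 0.60, 0.30, 0.10, 0.05, 0.05]

print(f"Infections skewed old:   {apparent_ifr(old_skewed):.2%}")
print(f"Infections skewed young: {apparent_ifr(young_skewed):.2%}")
```

The same cohort IFRs give very different apparent values depending only on who gets infected, which is the point of the scenarios.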

    This also demonstrates the imperative to protect nursing homes from infection. It appears in the US that 40% of all fatalities have taken place in residents of these homes. Some governors like DeSantis in Florida did a good job. Others in New Jersey, New York, and Pennsylvania did a terrible job and cost tens of thousands of lives. This also explains the apparently higher IFR in these locales and the much lower IFR in Florida.

    CFR’s are much more uncertain because of massive differences in rates of testing. Any statistical analysis is thus subject to more confounding factors than serological studies with a defined demographic distribution. And then there are the “Covid” death numbers. It’s a matter of intense controversy. Colorado just reduced theirs by 25%.

    Ioannidis is aware of all this and is attempting to correct his numbers based on the available data on age structure of those tested. More work needs to be done.

    • Your numbers are wrong. The 5/16 ICL study came up with roughly 0.8% IFR for the US.

      On what are you basing your claims of relative nursing home infection rates between Florida and NY et al?

      The imperative of protecting elderly from infection has been obvious since Wuhan. That’s why the ICL 5/16 report suggested full suppression — because it appeared extremely difficult to selectively protect the most vulnerable while the infection rages among the less-vulnerable populations. Sweden is a good example of how difficult it is to selectively protect the vulnerable.

      • “On what are you basing your claims”

        He bases them primarily on ideology, just as he does in his climate science denialism, with a healthy dose of “I’m smarter than all of the specialists”.

        With that level of confidence, it doesn’t matter if his numbers are wrong.

    • Enough! Please take these political arguments over to those other blogs. There are a million places on the internet for political debates. This is one of the few places we can discuss statistics and social science. Thanks for understanding.

  14. I’m curious about the states where R is below 1 for the entire time series.

    For example, Montana https://mrc-ide.github.io/covid19usa/#/details/MT

    I can understand that there are few cases there (it’s pretty isolated in general, and tourism has ground to a halt since March), but the fact that there’s no exposure doesn’t, to me, imply that the R itself would be low.

    I’m missing something, but what?

    • Imagine you have some location where the normal activity is to live on your large ranch, and only see other people every 2 weeks when you drive into town to do a large shopping run. You go shopping and get sick… what’s the chance you’ll spread the virus? Pretty low.

      Now, obviously that’s not everyone in Montana, but I think in places like AK and MT and ID there are many people who have everyday normal life activities which are basically equivalent to what a city dweller would call “shelter in place” already.

      • That is pretty much the way it is where I live in rural NW California. The main differences in pre/post Covid life are that we no longer have Sunday breakfast at the community center, and when people come over for coffee we sit outside.

      • Yep.

        The largest city in Montana (Billings) is about 100,000 inhabitants – pretty small by the standards of the coastal US.

        Wyoming basically has no cities (Cheyenne and Casper are 60k-ish, everything else much smaller).

        There are probably not that many concerts, huge sports stadiums, etc. in these states either to serve as super-spreader events.

        (Depending on how seriously you take the “contact networks reduce the herd immunity threshold” idea, that might make a huge difference in these highly-rural areas. Once the grocery store and feed store workers are immune, transmission might drop off drastically. The only reason I’m not convinced that the threshold *will* be dramatically lower in the rural US is churches.)

  15. It doesn’t currently seem possible to comment on open review, although I’ve asked to be added.

    A couple comments on the projections:
    -“Rt is above 1 in 24 states” seems wrong or at least misleading, even after including the 95% CI of 20-30. Under a null hypothesis that R is 0.95 everywhere, since there appears to be a standard deviation of around 0.17 in the estimation errors, you would see around 20 of the states have point estimates above 1 just by chance.

    -The text itself also seems to oddly ignore uncertainty in places, for instance claiming that transit has no significant effect but time at home does, even though both have huge credible intervals that overlap (and indeed the coefficients are probably negatively correlated with each other due to the strong correlation between different types of mobility).

    -Until I got to the methods, I had the impression that they were estimating recent Rt from data and concluding that interventions had moved it above 1. But actually, they are *projecting* that it is above 1 based on an assumed linear relationship between mobility and cases. Sure, this is a reasonable first-pass assumption but I think it’s important to have it be up-front that this is what the current estimates are based on. The way the abstract is worded, it sounds like *current* estimates are from data and *future* estimates are from the projection, but actually estimates from *3 weeks ago* are from data and *current* estimates are from the projection.
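The first point can be checked with a quick normal approximation. A minimal sketch, assuming 50 states with independent, normally distributed estimation errors; the true R of 0.95 and the standard deviation of 0.17 are the values quoted in the comment, and everything else is illustrative:

```python
import math


def normal_cdf(x: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


N_STATES = 50      # assumption: one estimate per state, errors independent
TRUE_R = 0.95      # null hypothesis from the comment
ESTIMATE_SD = 0.17 # approximate estimation error from the comment

# Probability that a single state's point estimate lands above 1 by chance.
p_above = 1.0 - normal_cdf(1.0, mean=TRUE_R, sd=ESTIMATE_SD)
expected_above = N_STATES * p_above

print(f"P(point estimate > 1) = {p_above:.3f}")
print(f"Expected states above 1 = {expected_above:.1f}")
```

Under these assumptions the expected count comes out close to the "around 20" quoted above.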

    • >>-“Rt is above 1 in 24 states” seems wrong or at least misleading, even after including the 95% CI of 20-30. Under a null hypothesis that R is 0.95 everywhere, since there appears to be a standard deviation of around 0.17 in the estimation errors, you would see around 20 of the states have point estimates above 1 just by chance.

      This is why I am pretty skeptical of even correct models making very useful predictions at this point: many states seem to be pretty close to 1, so projecting a small error makes a huge difference (slightly below 1, it dies out eventually; slightly above 1, it eventually gets quite large).

      rt.live has only 5 states above 1, but tons of states where even the 50% confidence interval is on both sides of 1…
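To illustrate how sensitive projections are near R = 1, here is a toy geometric projection. It assumes a constant R and discrete generations of infection, and the starting count of 1,000 and the two R values are hypothetical; it is not the report's model.

```python
def project_cases(initial_cases: float, r: float, generations: int) -> list:
    """Cases per generation if each case causes r new cases, with r held constant."""
    return [initial_cases * r ** g for g in range(generations + 1)]


# Two hypothetical values of R, equally far from 1 on either side.
below = project_cases(1000, 0.95, 20)
above = project_cases(1000, 1.05, 20)

print(f"R = 0.95 after 20 generations: {below[-1]:.0f} cases per generation")
print(f"R = 1.05 after 20 generations: {above[-1]:.0f} cases per generation")
print(f"Ratio between the two: {above[-1] / below[-1]:.1f}x")
```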

      • +1

        I especially hope they address your comment about the need to clearly delineate between what is estimated and what is projected. The solution could be as simple as cross-hatching the confidence bands for projected results (which are the last 2 weeks of results for Rt, case counts, and infection events*), or some other simple visual indicator.

        Like you, I am also curious about why the mobility effect was modeled as linear on a logit link, or rather about how safe an assumption this is. The authors observe a distinctly nonlinear association between population density and Rs (see their Figure 3), which tends to suggest that the “linear on a log link” mobility effect assumption may not be safe. With a nonlinear association for population density, it seems like it may not be a good idea to assume a particular relationship for mobility without checking.

        Since there are 50 states, I’d guess they have enough data to investigate by modeling mobility effects as a spline on a logit link. But they already have many secondary analyses looking at this or that assumption, and may be too crunched for time to add much more. With open code and open review, the authors could fairly just say “try it yourself.”

        * Approximately, see page 20 of their report.
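For readers unsure what "linear on a logit link" means in practice, here is a generic sketch. The functional form, the coefficient, and the function names are assumptions for illustration and are not taken from the report's code: mobility enters through a linear predictor, and a scaled inverse logit maps it to a multiplier on R0 that equals 1 when mobility is at baseline.

```python
import math


def inv_logit(x: float) -> float:
    """Inverse logit (logistic) function."""
    return 1.0 / (1.0 + math.exp(-x))


def rt_from_mobility(r0: float, beta: float, mobility_change: float) -> float:
    """Hypothetical link: R_t = R0 * 2 * inv_logit(beta * mobility_change).

    mobility_change is the fractional change from baseline (negative means
    less movement). At 0 the multiplier is exactly 1, so R_t equals R0.
    """
    return r0 * 2.0 * inv_logit(beta * mobility_change)


# Illustrative values only: R0 = 2.5, beta = 3.0.
for change in (0.0, -0.2, -0.4, -0.6):
    print(f"mobility {change:+.0%}: R_t = {rt_from_mobility(2.5, 3.0, change):.2f}")
```

A spline version would replace the single term `beta * mobility_change` with a flexible function of mobility, which is the check suggested above.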

  16. I have been directed here by Andrew.

    My background is in probabilistic forecasting of complex computer models. Particularly for oil and gas reservoirs, where there is a reservoir simulator (which solves complex PDE’s), and a simulation deck containing parameters for the particular model. The simulation may take from minutes to days to perform a single run. I use surrogate models and HMCMC (NUTS).

    In the reservoir world, we have the problem of fitting to historical measurements, and then the problem of probabilistic forecasting. I have applied Bayesian techniques for both, where the number of parameters may vary up to 1000.

    I see the LMIC short-term forecasts dashboard (code at https://github.com/mrc-ide/squire) uses an age-structured SEIR ODE model. From what I can see the current Stan-based work does not use this ODE model. Has anybody tried using Bayesian techniques with the ODE model? I know Stan can explicitly use an ODE, but my own software can also, with careful use of surrogates, apply any black box ODE or PDE model.
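For anyone unfamiliar with the model class mentioned here, a minimal SEIR sketch follows. It is deliberately not age-structured and is not the squire model: the single-population form, the forward Euler integration, and every parameter value are assumptions chosen for illustration. A function like this is the kind of forward model one would wrap in a likelihood for Bayesian fitting.

```python
def simulate_seir(population, initial_infectious, beta, sigma, gamma, days, dt=0.1):
    """Integrate a basic SEIR model with forward Euler steps.

    beta: transmission rate per day, sigma: 1 / latent period,
    gamma: 1 / infectious period. Returns one (day, S, E, I, R) tuple per day.
    """
    s = float(population - initial_infectious)
    e, i, r = 0.0, float(initial_infectious), 0.0
    history = [(0.0, s, e, i, r)]
    steps_per_day = round(1.0 / dt)
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((float(day), s, e, i, r))
    return history


# Hypothetical parameters: R0 = beta / gamma = 2.5, 5-day latent and infectious periods.
history = simulate_seir(population=1_000_000, initial_infectious=10,
                        beta=0.5, sigma=0.2, gamma=0.2, days=200)
peak_day, _, _, peak_infectious, _ = max(history, key=lambda row: row[3])
print(f"Peak infectious: {peak_infectious:,.0f} on day {peak_day:.0f}")
```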
