Whassup with the haphazard coronavirus statistics?

Peter Dorman writes:

This piece by Robinson Meyer and Alexis Madrigal on the inadequacy of Covid data is useful but frustrating. I think they could have dispensed with the self-puffery, rhetoric and sweeping generalizations and been more detailed about data issues. Nevertheless the core point is one that you and others have stressed, that too much attention is given to analysis when measurement itself is often the biggest problem.

There is a secondary question the authors never raise: why people in leadership positions were so acquiescent to inaccurate and misleading data production. No doubt the culture of “data-driven” decision-making promulgated by MBA, MPA, MPH and similar programs contributed, but I think there is also an incentive problem. Such people are judged on the basis of data delivered to peers and constituents, however flawed. If you “flatten the curve,” even if the curve itself is illusory, you get rewarded. This is speculation of course.

I agree with Dorman on all points. The linked article is definitely worth reading. Key quote:

Before March 2020, the country had no shortage of pandemic-preparation plans. Many stressed the importance of data-driven decision making. Yet these plans largely assumed that detailed and reliable data would simply . . . exist. They were less concerned with how those data would actually be made.

They do say some things that don’t completely make sense, like “Data are just a bunch of qualitative conclusions arranged in a countable way.” Huh? Also, I wish they’d point to other data summaries such as Carnegie Mellon’s Covidcast (discussed here). Their main point, though, that data are important and don’t come by themselves, is super important.

The only thing that puzzles me is the idea that the government should be so bad at this. The Bureau of Labor Statistics is 136 years old! And then there was all the data collection in the New Deal period, and the World War II mobilization. I guess that data got a bad rap during the Vietnam War, back when government officials promised the country they could run the war as efficiently as they’d run Ford Motor Company. Still, the Federal data system is huge, and they have a lot of competent, serious professionals all over. It didn’t seem so outlandish to assume that the CDC was on top of this sort of thing.

I feel like Meyer and Madrigal, or somebody, needs to write a follow-up article on how this all went wrong. They write, “Perhaps no official or expert wants to believe that the United States could struggle at something as seemingly basic as collecting statistics about a national emergency,” and they talk about federalism—but federalism is an issue with just about every government statistic. I remain baffled as to what has been happening here.

Maybe we should look at it another way by comparing to familiar economic statistics. We all know that inflation and unemployment measures are imperfect: inflation depends on what’s in the basket, and there’s also this weird thing where inflation is supposed to be a bad thing but it’s also supposed to be a healthy thing if “property values” are going up. The unemployment numbers don’t include people who aren’t looking for work. And much of the time it seems that the stock market is used as a measure of the economy, which is really weird (no matter what that Stasi guy says). So, my point is that even in the world of economic statistics, there are difficulties of interpretation, and opinion leaders will often grab on to numbers without thinking clearly about them. This fits in with Meyer and Madrigal’s point that, not only does it take important work to collect, compile and clean data, it also takes work to put data in context.

Measurement, baby, measurement.

53 thoughts on “Whassup with the haphazard coronavirus statistics?”

  1. I agree with both your points and Peter’s. In the case of COVID data, I don’t think we should view it as just another example of poor data measurement – much government data (in the US and elsewhere) is of reasonable quality. COVID data is particularly poor in the US as a result of the way our health care “system” is organized. Or rather, how it is disorganized. Our “free market” approach to health care (despite the heavy involvement of government in that sector) shows up in inconsistent measurement, patchy availability, and many associated problems of data quality. Even HIPAA, despite worthwhile motivation and important protections, contributes to the difficulty of obtaining good quality data. Private insurers probably have the best data available, but it is difficult to obtain this (though they share more with each other than with the government or the public). Asking the basic question: who owns health care data? will quickly show many of the difficulties associated with collection of good quality data.

  2. > The only thing that puzzles me is the idea that the government should be so bad at this.

    Possibly the CDC was infected with the disease that has infected so much of the rest of the government’s data-collecting apparatus:

    > For most of the twentieth century, Census Bureau administrators resisted private-sector intrusion into data capture and processing operations, but beginning in the mid-1990s, the Census Bureau increasingly turned to outside vendors from the private sector for data capture and processing. This privatization led to rapidly escalating costs, reduced productivity, near catastrophic failures of the 2000 and 2010 censuses, and high risks for the 2020 census.


    • For some context




      While I agree that the cutover from the CDC to private contractor TeleTracking caused a lot of problems, I think private contracting is in principle a good idea. My experience with government agency data processes is a lot of emailing excel spreadsheets around that ends with someone slowly typing row-by-row into some MS Access 97 interface to SQL Server 2000. I think the best private-sector data management is probably way ahead of the best government agencies.

      The problem is much of the private-sector is just as bad or worse, with no accountability. The same spreadsheet-email-ETL happens at many private “analytics” companies. And it seems like government in the US either lacks the expertise or the incentives to evaluate their contractors critically. For the libertarian types, it bears repeating that privatization-by-way-of-state-contract is at least as corruptible a process as big-government itself.

      • Speaking as someone who has been on both sides of government technical services contracts, the biggest problem seems to be that it’s a lot of work to justify anything but the cheapest bid for a lot of things, and a lot of government employees do the safe, easy thing and accept the low bid. In a public sector role, we had a lowball bid scare off everyone competent, and the winner could barely do the work and was constantly missing deadlines and fibbing and making excuses. Painful.

      • > My experience with government agency data processes is a lot of emailing excel spreadsheets around that ends with someone slowly typing row-by-row into some MS Access 97 interface to SQL Server 2000.

        > And it seems like government in the US either lacks the expertise or the incentives to evaluate their contractors critically. For the libertarian types, it bears repeating that privatization-by-way-of-state-contract is at least as corruptible a process as big-government itself.

        There is an entire ecosystem surrounding interns printing out MS access sheets from one computer and then typing what they see into excel spreadsheets on another before those numbers get to you.

  3. The case/death numbers are only accurate to within ~70%. We have known this since last summer:

    Officials at the Wadsworth Center, New York’s state lab, have access to C.T. values from tests they have processed, and analyzed their numbers at The Times’s request. In July, the lab identified 872 positive tests, based on a threshold of 40 cycles.

    With a cutoff of 35, about 43 percent of those tests would no longer qualify as positive. About 63 percent would no longer be judged positive if the cycles were limited to 30.

    In Massachusetts, from 85 to 90 percent of people who tested positive in July with a cycle threshold of 40 would have been deemed negative if the threshold were 30 cycles, Dr. Mina said. “I would say that none of those people should be contact-traced, not one,” he said.

    Other experts informed of these numbers were stunned.

    “I’m really shocked that it could be that high — the proportion of people with high C.T. value results,” said Dr. Ashish Jha, director of the Harvard Global Health Institute. “Boy, does it really change the way we need to be thinking about testing.”


    If the threshold (or other details of the test) differ across space and time that can be very misleading. There is basically no data available on that.

    Also, there are strains that test negative when using the standard PCR primers. It was thought an area was especially resistant to covid, but it was the tests giving false negatives:


    And the reports of asymptomatic transmission from fully vaccinated people are starting to appear:

  4. > The only thing that puzzles me is the idea that the government should be so bad at this.

    Not that I’d suggest a counterfactual scenario where more funding would necessarily have netted better results, but it likely isn’t entirely irrelevant that public health has been increasingly starved of funding for decades.

  5. “The Bureau of Labor Statistics is 136 years old!”

    Now, this point is important more generally.

    I don’t recall the title any more, but I remember having skimmed through a monograph on the history of statistics. Therein, the author explicitly made a separation between “statistics as a practice of collecting statistics” and statistics as a scientific field we know now. Maybe the former notion has received too little attention?

    We also know from discussions on this blog about the importance of measurement and measuring, as opposed to analyzing data (the latter notion of statistics).

  6. … “puzzles me is the idea that the government should be so bad at this”

    Then the key question is why you reflexively assume that ‘the government should be good at this’ (?)

    There are mountains of current and historical data demonstrating that government politicians and technocrats are highly inefficient and often incompetent at problem solving.

    The most fundamental political divide among people is those who instinctively trust government versus those who do not.

    The CDC SARS data measurement processes are a huge mess. This is not surprising at all to objective observers.

    (notice how many commenters here quickly focused upon the “puzzles me” statement)

    • Lindell:

      I would not expect the government to be particularly good at problem solving. I agree with you on that.

      The thing that surprised me was that the government would be bad at data collection. The government has had lots of experience with high-quality data collection. I guess you’re right that the CDC has been a mess; I can’t disagree with you there. It still puzzles me what went wrong, given the high quality of data collection by the BLS, Census, etc.

    • Lindell –

      > The most fundamental political divide among people is those who instinctively trust government versus those who do not.

      I think that’s a cartoonish and malignant characterization. Few people in this country have some kind of blind “trust” in government, and almost everyone relies on government to some extent.

      While there is variance along that scale, I think that we also have a lot of related cynical exploitation; IOW, where people might quote government crime statistics when they want to make a case for “law and order” but then turn around and say that COVID death statistics or vaccine statistics are completely unreliable because of massive conspiracies to defraud the public.

      Often, levels of “trust” in government are a function of the party affiliation of who is judging in relation to which party is in power.

  7. > “Data are just a bunch of qualitative conclusions arranged in a countable way.”

    Is this a different way of talking about measurement? People make qualitative judgments about whether measures are good operationalizations of the theoretical constructs.

    Or is it a frequentist statement, as in people make qualitative conclusions about whether statistical tests are significant?

    • It still maintains that the stuffs are rankable! – a hard assumption.

      I am not kidding, although I know better other countries, & econometrics up from primary data…

    • In an epi context I read this as “what counts as a diagnosis or cause of death is a qualitative conclusion”? But maybe that’s wrong.

  8. > Nevertheless the core point is one that you and others have stressed, that too much attention is given to analysis when measurement itself is often the biggest problem.

    Analysis, statistics, and especially machine learning are gadgetbahns. Measurement and data collection are trains.


    > Any time somebody in power proposes to solve an infrastructure problem by first developing some new technology, or by using some proposed technology that hasn’t been deployed yet, we should be distrustful.

    > Often, the proposal may simply be an excuse to not invest in infrastructure today, because tomorrow some techno-fix will come along and solve all our problems ‘for free’. This tactic may work especially well if the proposal includes an appeal to some futuristic dream or a nationalistic project.

    “Updates 3 times a week with thorough scrutiny? Pathetic. Our magic private contractor will get you up-to-date data EVERY DAY!”

  9. I forget who this saying is attributed to, but I think it’s relevant here: “measure what you value, because you value what you measure.”

  10. I think there’s possibly a distinction to be made here about the type and speed of data.

    The Bureau of Labor Statistics, the Bureau of Justice Statistics, the Census, and many other government statistical agencies are good at carefully collating and reporting information. Often the process takes months or years, they issue revisions of old data, and so on.

    But in COVID, the need was for data to be accurate and up-to-date within days, not months. For example, the National Center for Health Statistics collates death records to produce statistics on causes of death. Usually, death certificates are completed days to weeks after a death (depending on the investigation needed to determine cause of death, get test results back, get paperwork done, etc.), are submitted to state agencies after that, are then subject to more reporting delays as the states aggregate them, and finally get reported to the federal government. Accurate data eventually arrives — but only weeks or months after each death. That might be acceptable for annual reporting of causes of death and for some public health research, but it’s not useful at all during the pandemic.

    I don’t know too much about the health reporting systems otherwise, but I assume this is a key theme: Slow reporting is relatively easy, while rapid and accurate reporting requires close connections between agencies and well-designed reporting systems.

    • As they often say — “Cheap, fast and accurate: pick any two.” Government data collection has long picked the first and third.

  11. In the land of the blind, the one eyed data provider is king. Those who know a lot about ensuring high quality data in context likely were excluded.

    Though I am biased, I do think OHDSI was doing a really good job at ensuring high quality data from hospitalisations…

    Even a fair percentage of researchers likely still believe that data entry errors (magically) even out on average.

  12. Funny I suppose, as in most languages “statistics” is derived from a Latin word meaning “of the state”, state probably meaning “a political entity”. I.e. measurement of the state. Not surprising considering the early history of statistics and the fact that a lot of what a state does is dependent on the collection of data. Taxes mainly but land property, patents and later voting, etc.

    Could also be state as condition which is only explicit in Finnish “tilastotiede” (Science of a collection of conditions).

    Chinese and Malay and some other languages use “science of a bunch of numbers”

  13. I agree with Alex Reinhart above.
    “Cheap, fast and accurate: pick any two.” The complaints here have to do either with trying to do all three or with pivoting from the gov’t’s typical role of “cheap and accurate” to a role that includes “fast.” They got numbers fast and on the cheap, and the data quality is what you’d expect from that.

  14. Since you bring up how misleading the unemployment rate can be:
    “I was surprised to find one element of complete agreement. Mason may just be unaware how far contemporary labor economics has advanced from the policy world’s fixation on the unemployment rate — people without jobs who are looking for jobs.”
    “Labor economics has been focusing on the employment-population ratio for years. Ed Lazear harped on its decline. Casey Mulligan’s book “redistribution recession” excoriated Obama policies precisely on labor force decline. Most labor economics focuses on employment. Labor economists largely pay little attention to the unemployment rate, calling it “job search.””

      • Good luck with that. The press still quotes the Dow Jones Industrial Average every day despite its manifest failings.

        If you’re interested in some modern work to try and replace the unemployment rate and persuade journalists to try and think beyond what they’ve always done, I point you to a recent proposed replacement at lisep.org. (Full disclosure: I assisted LISEP in this effort.)

        • > The press still quotes the Dow Jones Industrial Average every day despite its manifest failings.

          Further than that, the financial press still quotes the Dow Jones in terms of points rather than percentages, meaning that single-day-gain/drop records are broken every month or so. Very basic mathematical error

        • What makes you think it’s an error? Those are records in terms of the metric cited, and since using that metric gives rise to more headlines (that people will read) why would they quote in terms of percent?

          Misleading perhaps. Error? Not likely.

        • This is tangential to the data-quality discussion but it reminds me of something. I occasionally eat lunch at a place that seems popular for financial planners to meet with their clients. So a couple times a year I overhear these discussions where the financial planner guy (they are always guys, by the way) apparently goes through a list of stocks and rates each one as a “hold”, “strong buy”, “sell” type thing and tells them to reshuffle their stock portfolio based on things like what sector is supposed to be helped or harmed by the latest election, etc.

          Do people really pay fees to these guys and expect to somehow beat the market by going with his stock picks? I’d have thought by now anyone with enough sense to have accumulated some wealth would also be clued in to how much of a scam that stuff is. When we live in a society where people piddle away a big chunk of their lifetime savings on paying someone to read tea leaves it’s no surprise we (as a society) can’t be bothered to collect meaningful data in the midst of a pandemic. We live in a culture based on wishful thinking and there’s always someone positioned to profit from it.

        • You’re right, of course, but you’re missing a crucial sociological component. As a distinguished (anonymous) economist friend of mine says:
          “The only place I see true value is when the market goes down a bunch, retail investors wait, sell at the bottom, wait because they are scared, and buy back in near the top. If an advisor can prevent one of those incidences and keep them invested they will have likely earned their fee for at least a bunch of years and possibly some excess return.”

          But yeah, the rest of it is just noise. But remember: noise is mean zero, so there are much more terrible things to worry about!

        • Name, Jonathan:

          Youall might be disappointed (or not) to learn that I’m one of those fools who uses a financial advisor. Our money’s sitting with TIAA. They take a fee every year (0.8%, I think it is, but maybe I’m off by a factor of 2; who knows?) and they invest it for us, advise us what insurance to buy, etc. I figure it’s worth it because whatever I do would probably lose me more than 0.8% per year compared to whatever they’re doing. I don’t expect them to read tea leaves or “beat the market”; I just want to get reasonable returns, whatever that means. Investments are complicated! I’m not complaining. It’s pleasant to be rich. I just don’t want to be investing on my own, and so I’m paying the going rate for TIAA’s services.

        • No surprise here… That’s essentially just what my former colleague is saying. There are lots of things I could do for myself but don’t. No shame in that at all. The interesting thing (and I’m probably putting words in Name’s mouth here) is that the competition between these guys means that they have to talk the game he describes: not “I will be a careful steward of your money,” but “I have information that is being ignored by the market and will make you rich.”

        • I think the LISEP criteria of not counting people as employed if they make less than some nominal figure means they’re not talking about “employment” in the same sense that Cochrane and other economists are. For example, a common puzzle in macro is why a drop in aggregate demand tends to result in an increase in unemployment rather than just a drop in wages & prices (which the 1921 recession somewhat resembles). The solution to that is commonly said to involve nominal “stickiness” in wages such that they are rarely reduced in nominal terms, even if inflation might reduce them in real terms. LISEP would just say any job which pays sufficiently poorly doesn’t count.

        • You are correct. Their intention, though, is not to discuss mere search behavior and matching success. Their object is to quantify, in some sense, the ability of the economy to generate jobs with livable wages. So they have concerns that are different than what economists typically focus on. Like all things, it is useful if it is calculated well and it is the calculation one is interested in. It is consistent with the BLS data (it uses the same data) and provides an alternative perspective.

  15. “Many stressed the importance of data-driven decision making. Yet these plans largely assumed that detailed and reliable data would simply . . . exist. They were less concerned with how those data would actually be made.”

    I encounter this A LOT with mid- to upper- managers. There’s this assumption that the data will just appear in a nice dashboard for them, with as much slicing and dicing as they’d like. Except that the data come from three or four different sources that don’t connect to each other by default, dimensions that could be used as primary keys are in different formats, and one of the sources has decided all you are getting is weekly data.

  16. I applaud the work of the Covid Tracking Project and wonder what the back story is on why they’ve shut down. I can’t agree with them that data quality is the chief reason why our pandemic response failed. Politics, misinformation, and lack of empathy are more important factors. It’s easy to look back and say we should have done this and that. But it’s a new virus, we’re developing new tests, we’re cutting corners to do things fast, we’re adapting to new information all the time. If the movie were re-run, we still would not have the knowledge last January that we have today. Like Andrew has often said, we have to embrace uncertainty. Instead, we expect the experts to offer the solace of certainty.

    When these authors say “Data-driven thinking isn’t necessarily more accurate than other forms of reasoning,” I wish to know what these other forms of reasoning are? If we had no data at all throughout 2020, what would have saved us from a worse outcome?

    From the measurement perspective, what could have helped is a stronger CDC that imposes measurement standards on the state health departments. The same type of issue plagues the clinical trials. I don’t understand why the FDA does not require every trial to use the same definition of a “case” and the same process to validate a case.

    • Kaiser:

      +1 for your second paragraph. To paraphrase Bill James, there’s no alternative to data-driven thinking. The alternative to good data-driven thinking is not no data-driven thinking, it’s bad data-driven thinking.

    • > When these authors say “Data-driven thinking isn’t necessarily more accurate than other forms of reasoning,” I wish to know what these other forms of reasoning are? If we had no data at all throughout 2020, what would have saved us from a worse outcome?

      This is probably referring to evidence based medicine, which rejects the use of reason and prior information. There has been so much obvious stuff:

      1) If there is a pandemic in another country you need to filter people who have recently been there at your border.

      2) If people’s lungs are working at low capacity leading to an oxygen deficiency it is beneficial for them to breathe pressurized oxygen so that the remaining lung can operate at over 100% capacity.

      3) If people have vitamin and mineral deficiencies you should correct those deficiencies.

      4) Do not blindly put people on dangerous and expensive ventilators just because they have low oxygen sats. Likewise for any other dangerous and expensive treatment.

      5) Do not give pro-oxidants like HCQ to people undergoing oxidative stress. Especially not weeks after they have cleared the virus rendering its supposed mechanism irrelevant! And really, really especially not without monitoring for the main side effect (methemoglobinemia).

      6) Antibodies wane and viruses mutate, and especially towards primarily non-viremic viruses. If you want herd immunity a large percent of the population needs to get them at near the same time. Ie, flatten the curve means perpetuating the virus indefinitely.

      7) Immunity monocultures are fragile, you do not want everyone in the world to be immune to the exact same protein sequence.

      • Anoneuoid said, “Antibodies wane and viruses mutate”

        This brings to mind hearing (in discussing a friend who is undergoing chemotherapy) that there is some belief that the coronavirus mutates more in people who are immunocompromised (e.g., those in chemotherapy). A quick web search got me this: https://www.scientificamerican.com/article/covid-variants-may-arise-in-people-with-compromised-immune-systems/
        Anyone here know anything more about this?

        • Yes, when there are weak antibodies the virus can more easily mutate to escape the antibodies. However, your link refers to this paper where the patient was not only immunocompromised but also received convalescent plasma:

          There was little change in the overall structure of the viral population after two courses of remdesivir during the first 57 days. However, after convalescent plasma therapy, we observed large, dynamic shifts in the viral population, with the emergence of a dominant viral strain that contained a substitution (D796H) in the S2 subunit and a deletion (ΔH69/ΔV70) in the S1 N-terminal domain of the spike protein. As passively transferred serum antibodies diminished, viruses with the escape genotype were reduced in frequency, before returning during a final, unsuccessful course of convalescent plasma treatment.


          The data from this single patient indicate caution should be used for convalescent plasma therapy in patients with immunosuppression of both T cell and B cell arms; in these patients, the administered antibodies have little support from cytotoxic T cells, thereby reducing the chances of clearance and theoretically raising the potential for the evolution of escape mutations in SARS-CoV-2.


          You can see this patient had basically no antibodies towards SARS2 until the convalescent plasma was first injected on day 63: https://www.nature.com/articles/s41586-021-03291-y/figures/7

          After the plasma treatment selection for variants then happened almost immediately: https://www.nature.com/articles/s41586-021-03291-y/figures/10

          So it is not being immunocompromised per se that matters, but having the weak antibodies or inability to take advantage of those antibodies (due to weak immune cells).

          That is why they should be warning people to be very careful for the first week after vaccination, since the Pfizer vaccination was reported to cause a transient lymphocytopenia (drop in immune cells). Since then we have seen in multiple papers (I think it is up to four or five now) that chance of testing positive for covid is ~40% higher (10%-100%) in that first week after the first dose.* I haven’t seen any data on all cause mortality/morbidity for that first week, but I expect that is higher as well. That goes especially for nursing home residents and similar.

          Anyway, if you wanted to select for new variants this is a near ideal way to do it.

          *If anyone wants those sources I have shared them on this blog before and can go find the link.

      • I forgot the two most ridiculous things we did not need to wait for data on.

        8) Using Ct thresholds up to 40 to indicate a positive case rather than determining the threshold based on presence of culturable virus.

        9) Sending covid patients into nursing homes instead of literally anywhere else.

        It is difficult to understand why someone would need to be warned about doing either of those, but they happened anyway.

  17. Can a stats expert please explain what this paper is saying:

    Results: We observed a small proportion of care home residents with positive PCR tests following vaccination 1.05% (N=148), with 90% of infections occurring within 28-days. For the 7-day landmark analysis we found a reduced risk of SARS-CoV-2 infection for vaccinated individuals who had a previous infection; HR (95% confidence interval) 0.54 (0.30,0.95), and an increased HR for those receiving the Pfizer-BioNTECH vaccine compared to the Oxford-AstraZeneca; 3.83 (2.45,5.98). For the 21-day landmark analysis we observed high HRs for individuals with low and intermediate frailty compared to those without; 4.59 (1.23,17.12) and 4.85 (1.68,14.04) respectively. Conclusions: Increased risk of infection after 21-days was associated with frailty. We found most infections occurred within 28-days of vaccination, suggesting extra precautions to reduce transmission risk should be taken in this time frame.


    This is possibly the worst written paper I have ever read but it is the first looking at the effect of vaccines on nursing home residents in the UK.
