Low rate of positive coronavirus tests

As happens sometimes, I received two related emails on the same day.

Noah Harris writes:

I was wondering if you have any comment on the NY State Covid numbers. Day after day the positive percentage stays in a tight range of about 0.85-0.99%. How can the range be so narrow and stable? Do you think we are at the limits of the test and there may be a significant amount of false positives?

And here’s Tom Daula:

Relatively old article, but I think it is interesting considering your analysis of the Stanford study. Another wrinkle for the measurement problem; both of contagious individuals and viral load sufficient to be related to death. The article doesn’t mention international comparisons.

The Times article, which is not so old—it’s from 29 Aug—is entitled, “Your Coronavirus Test Is Positive. Maybe It Shouldn’t Be.
The usual diagnostic tests may simply be too sensitive and too slow to contain the spread of the virus.”

I don’t really know what to think about all this, but I’ll share with you.

38 thoughts on “Low rate of positive coronavirus tests”

  1. Yes, NY has a significant proportion of false positives. It’d have to at that low level. I’m not sure if it’s 10% or 50%, but undoubtedly more than 5% of the positive tests are not true positives.

    Remember, if you contaminate 1% of the tests with your positive control, then you’ll get a 1% positive rate, and that’s easy to do by accident. I expect that under these conditions people are doing better than that, but maybe they’re contaminating 0.2% of tests… that’d still mean something like 10 or 20% of positives are false.
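The arithmetic behind that estimate can be sketched as follows. This is a rough illustration, not a description of any lab's process: it assumes every contaminated test reads positive and that contamination of samples that were already truly positive is negligible. The function name `false_share` is made up for this example.

```python
def false_share(contamination_rate, observed_positive_rate):
    """Share of reported positives that would be false if
    `contamination_rate` of all tests were contaminated and read positive.

    Assumption: contaminated samples that were already true positives
    are rare enough to ignore.
    """
    return contamination_rate / observed_positive_rate


# 0.2% contamination against the 0.85%-0.99% positive rates quoted for NY.
for observed in (0.0085, 0.0099):
    share = false_share(0.002, observed)
    print(f"positive rate {observed:.2%}: {share:.0%} of positives false")
```

Under these assumptions a 0.2% contamination rate lands at the upper end of the "10 or 20%" range, because the overall positive rate is itself below 1%.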

    • Now I’m commenting on things I understand poorly, but wouldn’t you expect that the contamination rate would be fairly variable, depending on whether some lab tech got a bad night’s sleep or was fighting with their partner, etc.? The original question is why the % positive is so consistent.

    • why do you state that there is a high proportion of false positives?

      E.g. New Zealand has practically no positives even though it does 100,000s of tests. I think it has a positive test rate of <0.03%, which I would take as an upper bound for false positives. What do you make of that?

      • Are NZ tests the same as US tests? I know that US testing runs 40+ cycles. If NZ decides to run only out to say 30 cycles, then they won’t detect microscopic contamination (10 extra cycles is about 1000x extra amplification).

        This is actually a thing I’ve heard advocated here in the US, reducing the maximum cycle count so as to avoid this issue.
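The "about 1000x" figure follows from the usual idealization that each PCR cycle doubles the target. A minimal sketch, where the `efficiency` parameter is an assumption added for illustration (1.0 means perfect doubling; real reactions are typically somewhat below that):

```python
def fold_amplification(extra_cycles, efficiency=1.0):
    """Fold increase in target copies after `extra_cycles` more cycles.

    Each cycle multiplies the copy number by (1 + efficiency);
    efficiency=1.0 is the idealized perfect-doubling case.
    """
    return (1.0 + efficiency) ** extra_cycles


# Going from a 30-cycle cutoff to a 40-cycle cutoff.
print(fold_amplification(10))        # 1024.0 with perfect doubling
print(fold_amplification(10, 0.9))   # smaller if doubling is imperfect
```

With perfect doubling, ten extra cycles give 2^10 = 1024, which is where the roughly 1000x comes from.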

        This shows that NZ is doing around 100-200k tests a month

        https://www.health.govt.nz/our-work/diseases-and-conditions/covid-19-novel-coronavirus/covid-19-current-situation/covid-19-current-cases

        In the US we’re doing 700-800k tests a DAY. We’re doing in the US as many tests every day as NZ has done EVER.

        So how many false positives has NZ had ever since the start of the pandemic? If we are doing the same kind of test, then that’s what we’d expect to be generating EVERY DAY in the US.
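To put the scale argument in numbers, here is a small sketch. It assumes, purely hypothetically, that US tests had a false positive rate equal to the <0.03% positive rate quoted above as an upper bound for NZ; nothing in the thread establishes that the two countries' tests behave the same way.

```python
def expected_false_positives(tests_per_day, false_positive_rate):
    """Expected number of false positives per day at a fixed per-test rate."""
    return tests_per_day * false_positive_rate


NZ_UPPER_BOUND = 0.0003  # the <0.03% figure quoted above, used as a hypothetical rate

# US volume of 700-800k tests a day, as stated in the comment.
for tests in (700_000, 800_000):
    print(tests, round(expected_false_positives(tests, NZ_UPPER_BOUND)))
```

Even at that low hypothetical rate, the daily US volume would produce a couple of hundred false positives per day.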

        • I have worked with PCR data for a long time. A few years ago I had the assignment to review different validation plans for a diagnostic test. I know there is some rushing with COVID-19, but any diagnostic test should go through a validation, a series of experiments to assess its specifications. The cut-off for a yes/no test is determined based on the validation, typically a number near but below the truncation value. The truncation value is usually 40 but I have seen 45. Typically specificity, 1 minus the false positive rate, is reported as 99.9%, not 100%, when there are no false positives. Often there are false positives in a validation but the test will still have a specificity near 100%. We are dealing with a fluorescence measure that can show positive even without contamination.

          The NFL contamination case in August is an example of how a high false positive rate can be tied to a situation in a single lab.

      • Also I definitely believe that false positives are related to true positives. You can’t contaminate a well with a positive sample if you don’t have any positive samples. NZ went a long time with no positive samples, and during that period I’d expect very low false positive rates. Furthermore, probably if anyone gets a positive in NZ, immediately everyone jumps on it and re-tests the original sample. If you get a positive here in the US, where we’re generating 40,000 new cases a day countrywide, no one is going to pay any extra attention to it.

        To first order you might say the probability of a false positive is something like k * pp, where pp is the percentage of true positives and k is a number between say 0.003 and 0.1, but if pp = 0 then it doesn’t matter how big k is, you won’t get any.

        Of course it’s possible to contaminate with the synthetic positive control, but again, if everyone jumps on the positive result and does a re-test, re-testing will reveal it was spurious.
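The first-order model above can be written out directly. This sketch takes `pp` to be the fraction of tests that are true positives and treats the false positive probability as exactly `k * pp`; the derived share `k / (1 + k)` is a consequence of that simplification, not something stated in the thread.

```python
def false_positive_rate(k, pp):
    """First-order model: false positives per test scale with true positives."""
    return k * pp


def false_share_of_positives(k):
    """Share of reported positives that are false, for any pp > 0.

    Reported positives per test = pp + k*pp, so the false share is
    k*pp / (pp + k*pp) = k / (1 + k), independent of pp.
    """
    return k / (1.0 + k)


for k in (0.003, 0.1):
    for pp in (0.0, 0.001, 0.01):
        print(f"k={k}, pp={pp}: false positive rate per test = "
              f"{false_positive_rate(k, pp):.6f}")
    print(f"k={k}: false share of reported positives = "
          f"{false_share_of_positives(k):.1%}")
```

With `pp = 0` the model gives no false positives whatever `k` is, which is the NZ situation described; with `pp > 0` the false share depends only on `k`.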

  2. I think the “positive tests” mean different things to different people. What should a “positive test” ideally indicate? A few options to consider: (1) Should a positive test only indicate presence or vestige of the virus? (2) Should it indicate virulence and the likelihood of a person’s own mortality due to Covid? (3) Should it attempt to classify patients into groups that quantify the certainty they will get sick (and for how long)? (4) Should it predict the likelihood that a person can infect another person, and under what conditions?

    It would be a welcome advance to be able to discern, separate, and quantify concerns here. My guess is that most of these are likely unknown.

    • It’s worse than that. Different places use different primers, equipment, and sample collection, and then different thresholds for what counts as a positive.

      Diversity in approach is fine, but the problem is that information on how the details vary over time and location is unavailable, so all the numbers get treated the same.

  3. If the only variation in the numbers were from random sampling variation, then the standard deviation would be about 0.35%, based on 90,000 tests per day (test count data from https://coronavirus.jhu.edu/testing/individual-states/new-york). Two SDs of this would translate to +/- 0.7%. That’s close to the range stated (0.85 – 0.99%).

    There would also be variation in the number of tests performed each day. I haven’t run numbers on that, but by eye it looks to have a weekly modulation.

    It doesn’t look like the variations are too far out of line, but I don’t know how they can be reconciled with false positive rates we’ve seen in the papers.

    • But isn’t it also rather implausible that the *genuine* rate would stay the same?

      That would imply that either testing was growing / shrinking in step with the spread / decline of the virus, or that New York was *right at* R=1 for quite a while.

      Or is there some reason why that is plausible?

      • It is not implausible that testing is “growing / shrinking in step with the spread / decline of the virus”: the more people in my circle are diagnosed as positive, the more likely I am to undergo a test.

        • Yes, and this might be true in some places, but looking at the # of tests performed in NY it does not seem to be true there. The number of tests doesn’t seem to be changing that much, so it would still imply an oddly flat curve.

    • Hmmm, I get a different standard deviation but the same range. I think you misplaced a decimal for the SD.

      If the true infection rate of those tested is .92%, then I get a standard deviation of Sqrt(.0092 * .9908 / 90000) = .00032. So two SDs is .00064, which gives the range .856% – .984%, as you said.

      I think your .35% SD was intended as a percentage of the mean of ~.92%. But .00032 / .0092 is 3.5%, not .35%. Hence a 2 SD range of +/- 7% of the mean, which gives the right range.
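The corrected calculation is easy to reproduce. The sketch below treats each day's results as 90,000 independent draws with a true positive rate of 0.92%, which is the simplifying assumption both commenters are working under.

```python
import math


def binomial_sd(p, n):
    """Standard deviation of the observed proportion from n independent tests."""
    return math.sqrt(p * (1.0 - p) / n)


p = 0.0092   # assumed true positive rate among those tested
n = 90_000   # tests per day, from the comment above

sd = binomial_sd(p, n)
low, high = p - 2 * sd, p + 2 * sd

print(f"SD of daily rate: {sd:.5f}")
print(f"SD relative to the mean: {sd / p:.1%}")
print(f"2-SD range: {low:.3%} to {high:.3%}")
```

The absolute SD is about 0.032 percentage points, or roughly 3.5% of the mean, so the 2-SD band of about 0.856% to 0.984% matches the observed range.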

    • AV –

      > These are not randomized tests, through a sparse, clustered set of interactions with a great deal of heterogeneity.

      Yes. And I would imagine that the positives, and thus the false positives, might be clustered by region, for example if the positives come from places where the base rate is higher than 0.85-0.99%.

      So then would the picture of the “base rate fallacy” effect be different than if there were no heterogeneity and the base rate was uniform?

      • Hmmm. Or actually the true positives would be clustered by region but the false positives not so (they’d remain constant as a % of the number of tests)? So in areas where the base positive rate is higher, the % of positives that are false positives is lower?
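That intuition can be checked with a toy calculation. All numbers here are hypothetical, chosen only for illustration: two regions with different true positive rates, the same per-test false positive rate everywhere, and perfect sensitivity assumed for simplicity.

```python
def false_share(true_rate, fp_rate):
    """Share of reported positives that are false in one region.

    Assumes perfect sensitivity; only truly negative samples
    (fraction 1 - true_rate) can produce a false positive.
    """
    false_positives = (1.0 - true_rate) * fp_rate
    return false_positives / (true_rate + false_positives)


FP_RATE = 0.002  # hypothetical per-test false positive rate, same in every region
regions = {"low-prevalence": 0.002, "high-prevalence": 0.03}  # hypothetical true rates

for name, true_rate in regions.items():
    print(f"{name}: {false_share(true_rate, FP_RATE):.0%} of positives false")
```

With a constant per-test false positive rate, the false share falls as the regional base rate rises, which is the pattern the question describes.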

  4. Which NY State numbers are we talking about? Their lateral flow assay monitoring (known high number of false positives) or the PCR testing, where whole countries like New Zealand can have no cases despite continued testing?

    Doing quantitative PCR testing is more difficult than doing qualitative testing, see e.g. https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30424-2/fulltext

    I disagree with the idea that people with a low virus concentration in the sample should not be contact traced: we know that the virus concentration in the throat can decline from day 1 of symptoms, so the virus concentration may well have been higher (and the person quite infectious) before they got around to having a test sample taken; and we know that the infection usually develops in the throat/pharynx and then moves into the lungs, such that you can often find more virus in the sputum of hospitalized patients than in their throat.

    It’s kinda like when you find a burnt spot of ground: sure, that area may not be in flames now, but there sure was a fire, so you want to know where it may have spread while it was burning. That’s what contact tracing does.

    You also do not know if a low virus concentration in the sample really means a low virus concentration, for example the swabbing may not have been done properly.

    Stopping an outbreak is always time-sensitive, so you don’t really have time to double-check results before you initiate tracing contacts and isolating them. And having high test numbers means lab technicians put trays of 96 or 144 samples in a machine, run a preset procedure, and determine a result. There are both known positive and negative controls on those trays. It might be useful, if we’re speculating about contamination, to find out how often labs have to discard results because the controls show a problem.

    • >>where whole countries like New Zealand can have no cases despite continued testing?

      Well, as Daniel Lakeland mentions above, if the cause of false positives is cross-contamination from genuine positives, then no false positives among PCR tests in an area without virus isn’t incompatible with a meaningful number of false positives in an area with virus.

      And the questionably “false” positives where the sample is really positive in the PCR sense (there is actual COVID RNA) but the person is not sick or infectious (the viral RNA is old fragments of virus, not “live” infectious virus) will only occur if some of the population tested has had COVID in the past.

  5. Greetings,

    Haven’t read all responses, but assume it is PCR tests and their false positives that are discussed. I may have missed it, but what exactly is the gold standard (post-test) used to verify if a PCR test is indeed a FP? (or a FN). Is it that they do a serology (antibodies) one or something else?

    Thanks.

      • So the test serves as its own ‘post-measure’ or gold standard.

        No wonder FP and FN rates are all over the place then.

        In most diagnostic tests, one needs to have a completely different and verifiable way of assessing the presence or absence of something (e.g. biopsy verified by open surgery to detect FP/FN).

        I’m pretty sure they do something else, instead of running the same test on the same sample over and over again, without the knowledge whether the specimen is positive or negative.

        Oh, well. We’ll see (or not).

        • “In most diagnostic tests, one needs to have a completely different and verifiable way of assessing the presence or absence of something”

          In mining and metal exploration all assays are done using the same chemical process, but checked using duplicates, certified blanks and certified standards. The check samples are inserted into the sample stream by the people collecting the samples. The samples are prepped and analyzed in the order specified by the collectors, and lab prepping the samples also splits every sample so it can be tested later.

          It’s more than sufficient to test for contamination.

        • Well, in designing the test, you run the test adding “nucleotide free water” instead of sample, and this is your negative control. For a positive control you run the test with known fragments of RNA in it (or known to have virus grown in culture in it). To prove that the test is sufficiently sensitive and specific you run the test on several 96 well plates with a known pattern of synthetic positives and synthetic negatives. You then analyze how often the test gives incorrect results.

          What I was referring to was when you get a positive result that you think might be from contamination of the test, you then rerun the test going back to the original swab sample on a different machine with a different lab tech at a different time in duplicate or triplicate etc… if you get all negatives, you can conclude contamination was the issue. This is the kind of thing you’d see them do when they get a sudden positive after weeks of zero positives in all of New Zealand for example.

    • What we really need is a test to tell us whether a symptomatic person is shedding virus and is therefore infectious. If positive the person is quarantined and contacts are traced and tested. If negative do nothing. From this perspective, false positive PCR can occur if the person has had Covid and has residual viral RNA (which lasts for weeks) but is no longer shedding live virus. False positives might also occur due to cross-reactivity with other coronaviruses. False negatives should not really occur in those with recent onset symptoms as viral shedding occurs prior to and for the first week or so of the clinical course. How the swab is performed shouldn’t really matter, as those who are shedding will have viral RNA throughout the whole airway: mouth, throat, nose and nasopharynx.

  6. Something odd going on right now in TX (and probably other states).

    Deaths down (lagging indicator). Cases down / tests up (leading indicator). But hospitalizations almost perfectly flat. Hospitalizations ought to lag cases, but lead deaths. So what is going on?

    Now the cases/deaths declines are not extremely steep declines. And cases are possibly messy because TX is reporting a lot of back-log old cases not counted in the “new daily”. But still, something seems weird.

    • I think the timing on registration of everything, cases, deaths, tests (maybe not hospitalizations but maybe even that) is so all over the place that it’s hard to pin down leading and lagging based on daily or weekly numbers. There’s just no common timeline upon which things can lead or lag each other in a way that shows up in the trackers.

      • Hmm. Maybe. But I think in early summer cases rose, then hospitalizations, then deaths.

        So if that were why, then would we expect the trend to change soon (i.e., either hospitalizations to drop, or cases to rise)?

        I also wonder if it could be an issue of defining “COVID related” hospitalizations. Given the possibility of ‘stale’ PCR tests for weeks or even months after infection, if everyone who is admitted to hospital is tested, could that mess things up if there are relatively few currently symptomatic people but many cases in the recent past?

        • I think that would be a reasonable expectation but there’s so many inconsistencies in timing and, as you point out, even in the basic definitions. What counts as “COVID related hospitalization” has changed over time. What counts as a “case” has changed over time. The tests being used have changed over time. It’s even possible (although I have no idea whether it is true in Texas) that the definition of a “COVID related death” has changed over time.

          In effect what you’re looking for is an expected temporal sequence among what are likely non-comparable tallies. We might think that the rate of “hospitalizations” would drop followed by a drop in the rate of “deaths”. But that assumes that each daily or weekly “rate of hospitalizations” has a fixed relationship to the underlying population at risk, same with cases and deaths. I do not think that assumption is valid anywhere in USA over any period longer than a few weeks.

        • Hmm, yeah.

          I was just thinking that back in June cases and % positive rose while deaths were falling, and the people who were predicting that it was just a leading/lagging indicator issue (rather than for example an extremely dramatic drop in IFR*) turned out to be right.

          *I’m sure IFR has dropped somewhat, but deaths did rise significantly in July…
