## Literally a textbook problem: if you get a positive COVID test, how likely is it that it’s a false positive?

This post is by Phil Price, not Andrew.

This will be obvious to most readers of this blog, who have seen this before and probably thought about it within the past few months, but the blog gets lots of readers and this might be new to some of you.

A friend of mine just tested positive for COVID. But out of every 1000 people who do NOT have COVID, 5 of them will test positive anyway. So how likely is it that my friend’s test is a false positive? If you said “0.5%” then (1) you’re wrong, but (2) you’re in good company: lots of people, including lots of doctors, give that answer. It’s a textbook example of the ‘base rate fallacy.’

To get the right answer you need more information. I’ll illustrate with the relevant real-world numbers.

My friend lives in Berkeley, CA, where, at the moment, about 2% of people who get tested have COVID. That means that when 1000 people get tested, about 20 of them will have COVID and (ignoring false negatives) will test positive. But that leaves 980 people who do NOT have COVID, and about 5 of them will test positive anyway (because the false positive rate is 0.5%, and 0.5% of 980 is 4.9). So for every 1000 people tested in Berkeley these days, there are about 25 positives and 5 of those are false positives. Thus there’s about a 5/25 = 1/5 chance that my friend’s positive test is a false positive.
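The arithmetic in this paragraph can be reproduced in a few lines. This is a minimal sketch, not a clinical calculator: the function name `false_positive_share` is made up for illustration, and the default `sensitivity=1.0` encodes the simplifying assumption that everyone who has COVID tests positive.

```python
def false_positive_share(base_rate, false_positive_rate, sensitivity=1.0):
    """Fraction of positive tests that come from people without COVID.

    sensitivity=1.0 mirrors the simplification that every infected person
    tests positive; it is an assumption, not a measured value.
    """
    true_positives = base_rate * sensitivity
    false_positives = (1.0 - base_rate) * false_positive_rate
    return false_positives / (true_positives + false_positives)


# Numbers from the post: 2% base rate among people tested, 0.5% false positive rate.
share = false_positive_share(base_rate=0.02, false_positive_rate=0.005)
print(f"Chance a positive is a false positive: {share:.1%}")
```

With the post's numbers this works out to 4.9/24.9, a little under 20%, which is the "about 1/5" figure. Lowering `sensitivity` below 1 shrinks the number of true positives and so pushes the false positive share up.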

(That’s if we had no other information. In fact we have the additional information that he is asymptomatic, which increases the chance that the positive is false. He still probably has COVID, but it’s very far from the 99.5% chance that a naive estimate would suggest. Maybe more like a 65% chance.)

If you think about this issue once, it will be ‘obvious’ for the rest of your life. Of course the answer to the question depends on the base rate! If literally nobody had the virus, then every positive would be a false positive. If literally everybody had the virus, then no positive would be a false positive. So it’s obvious that the probability that a given positive is a false positive depends on the base rate. Then you just have to think through the numbers, which is really easy as I have illustrated above.
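For readers who prefer a formula, the same reasoning can be written compactly. The symbols are introduced here for illustration: π is the base rate among people tested, f is the false positive rate, and s is the test's sensitivity (the worked example above effectively takes s = 1).

```latex
\Pr(\text{no COVID} \mid \text{positive test})
  = \frac{f\,(1 - \pi)}{f\,(1 - \pi) + s\,\pi}
```

As long as f and s are both greater than zero, setting π = 0 gives 1 (every positive is false) and setting π = 1 gives 0 (no positive is false), which are the two extremes described above.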

Apologies to all of you who have seen this a zillion times. Or twice.

This post is by Phil.

1. rm bloom says:

Second question: [1a] how does the fact that the population prevalence is only available as a sample estimate affect this simple calculation? [1b] Generalize: Suppose all terms in “the equation” for post-test probability (specificity, sensitivity and base-rate) are derived from sampling estimators (with sampling distributions for the frequentist or with posterior distributions for the Bayesian).

• Mendel says:

[1b] is not factual. Specificity and sensitivity are not based on estimates in the tests where I have reviewed the documentation. For example, testing on 2019 samples ensures that any positives are false.

• rm bloom says:

So, testing on 2019 samples, guaranteed to be negative, will generate some number of false positives right? And therefore the false-positive rate is a ratio *estimate* based on samples drawn from a population. For a frequentist it is a point estimate with an error distribution around it; for a Bayesian it is a distribution for the parameter itself. In either point of view the false-positive rate is a statistical quantity — is it not?
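One way to see how this estimation uncertainty propagates into the post-test probability is to simulate it. The sketch below is illustrative only: the calibration counts (5 false positives among 1000 known negatives, 95 detections among 100 known positives) and the uniform Beta(1, 1) priors are hypothetical and not taken from any real test documentation, and the base rate is held fixed at the post's 2% purely to keep the example short.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical calibration data, invented for illustration.
neg_tested, neg_false_pos = 1000, 5   # known negatives, and how many tested positive
pos_tested, pos_detected = 100, 95    # known positives, and how many tested positive
base_rate = 0.02                      # held fixed here; in practice it is also uncertain

draws = []
for _ in range(10_000):
    # Beta posteriors for each rate under uniform Beta(1, 1) priors.
    fpr = random.betavariate(1 + neg_false_pos, 1 + neg_tested - neg_false_pos)
    sens = random.betavariate(1 + pos_detected, 1 + pos_tested - pos_detected)
    false_pos = (1 - base_rate) * fpr
    true_pos = base_rate * sens
    draws.append(false_pos / (false_pos + true_pos))

draws.sort()
lo, mid, hi = draws[250], draws[5000], draws[9750]
print(f"P(false positive | positive): median {mid:.2f}, 95% interval ({lo:.2f}, {hi:.2f})")
```

The point of the exercise is the width of the interval rather than any particular number: even with a thousand known negatives, a handful of false positives leaves the false positive rate, and therefore the post-test probability, quite uncertain.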

• Exactly: sensitivity, specificity, and prevalence must be estimated in order to derive the positive predictive accuracy Pr[disease positive | test positive]. Therefore, these numbers are not known exactly. Given the sample sizes (order 100), they’re not known well at all: the standard deviation of a proportion estimated from a binomial(100, p) sample is sqrt(p * (1 – p)) / sqrt(100). That’s an sd of about 0.014 if p = 0.02. So you can see the problem.
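As a quick check of that standard deviation, here is a small sketch; the function name `proportion_sd` is illustrative.

```python
import math


def proportion_sd(p, n):
    """Standard deviation of a proportion estimated from n independent trials."""
    return math.sqrt(p * (1 - p) / n)


# A rate of 2% estimated from 100 calibration samples.
sd = proportion_sd(0.02, 100)
print(round(sd, 3))  # 0.014
```

An sd of 0.014 on an estimate of 0.02 means the uncertainty is roughly 70% of the estimate itself. Because the sd shrinks only with the square root of `n`, it takes about 100 times as many samples to cut it by a factor of 10.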

Known negative cases are from blood banks. Known positive cases are independently verified (how, I’m not sure in the case of antibody tests). Then you can run these through your test site and use the false positive and false negative counts to estimate specificity and sensitivity respectively.

Prevalence is much harder, as it relies on these noisy tests. And it relies on sampling the population. Just assuming prevalence is the proportion of positive tests will not produce the right positive predictive probability estimate for the test, because it doesn’t adjust for sensitivity and specificity.
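A standard textbook correction for this is the Rogan-Gladen estimator, which backs prevalence out of the raw positive rate given point values for sensitivity and specificity. It is shown here only as a simple illustration: it is not the hierarchical Bayesian approach of the paper cited below, it ignores the uncertainty in its inputs, and the numbers used are hypothetical.

```python
def rogan_gladen(positive_rate, sensitivity, specificity):
    """Point estimate of true prevalence from the observed fraction of positive tests.

    Solves positive_rate = prev * sensitivity + (1 - prev) * (1 - specificity)
    for prev, then clips to [0, 1] because noisy inputs can push it outside.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test must be better than chance (sensitivity + specificity > 1)")
    estimate = (positive_rate + specificity - 1.0) / denom
    return min(1.0, max(0.0, estimate))


# Hypothetical inputs: 2.5% of tests positive, 90% sensitivity, 99.5% specificity.
print(f"{rogan_gladen(0.025, 0.90, 0.995):.4f}")
```

When the raw positive rate is below the false positive rate, the unclipped estimate goes negative. That is one practical reason to prefer a model that treats sensitivity and specificity as uncertain quantities rather than plugging in point values.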

Different testing sites run their own calibration tests. Andrew and I used the fact that we have seroprevalence testing reports from multiple labs for known positives and negatives to construct a hierarchical model of sensitivity and specificity. We can then use that to predict sensitivity and specificity on a large test sample for new subjects with unknown disease status that was taken at a new facility without any calibration data (the assumption being it is drawn from the same population of testing sites as the ones for which we do have calibration data). Here’s the paper, which JRSS somehow classified as economics (series C).

Gelman and Carpenter. 2020. Bayesian analysis of tests with unknown specificity and sensitivity. JRSS C.

We then show how to combine the calibration tests and population testing to infer population prevalence. We use post-stratification to adjust for the non-random sample. It’s all open source, including the code and data (which was from the Santa Clara study), and it includes a short tutorial I wrote as well as a slide deck:

Diagnostic Testing Web site: https://bob-carpenter.github.io/diagnostic-testing/
GitHub repo: https://github.com/bob-carpenter/diagnostic-testing

But what we do not do is adjust for people opting into testing. This is a huge issue. It’s the who’s-going-to-vote wrench in the works for trying to guess what the population prevalence is.

I’d also like to add a prior correlation between sensitivity and specificity—we didn’t do that in the paper because we didn’t have data from any labs that had both positive and negative calibration tests.

P.S. Positive predictive accuracy is known as “precision” in the ML world, where it’s almost universally estimated as TP / (TP + FP) from system performance on labeled training data. Andrew and Jennifer and Masanao suggest applying a hierarchical model, which has the side benefit of adjusting for multiple comparisons implicitly. I’ve applied this idea to crowdsourcing amateur and expert ratings, which Dawid and Skene (1979) modeled as a noisy measurement problem (i.e., a diagnostic testing problem).

2. The base rate fallacy has been endemic to COVID-19 from my vantage point, with the caveat that I am not sure I know what I’m talking about. Well…that said.

To answer some of the questions raised requires a very good understanding of the utility of the PCR test, especially as the window for infectivity is estimated to be about 3-7 days. One can test positive even though not able to infect others.

Michael Mina is about the only one that can answer some of the questions raised here. In any case, perhaps the PCR test is not one that is appropriate. Some experts in England are anti-mass testing with the PCR test.

3. Another interesting question that has popped up on Twitter is whether asymptomatics that contracted COVID are actually infecting others. The Twittersphere seems to be divided on this and so many other dimensions of COVID transmission.

• Phil says:

Jeez, how is the answer not known, a year into the pandemic!

• Navigator says:

To seriously study whether asymptomatic people infect others, we would need to let them infect others. Not sure which IRB board is cool with that.

There are cases showing that one infected person in the same household doesn’t necessarily infect other family members. I’m not sure how it is even possible, as they spend a lot of time together and breathe the same air. I personally know of three families with such ‘issues.’

There is so much nonsense based on ‘hunches’ about COVID, it hurts. The idea of a ‘super-spreader’ for example. Every infected person is a super-spreader at the peak of infectiousness. Interpersonal variability in oral cavity size and such is trivial compared to intrapersonal variability during the infection cycle.

The next issue will be which vaccine to choose, as soon as they approve the other two or more.

• Manuel says:

I have had the same experience with at least three couples. Wife gets positive result with no symptoms, husband negative without symptoms as well.
I think that what this tells us is that the proportion of false positives is, as the post explains, very high when the prevalence of the disease is low.

• Joshua says:

I know a couple where the woman was positive with clear symptoms. Man never tested positive AND is negative for antibodies. Go figure.

• Chebyshev says:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2862332/

“Perhaps the most interesting epidemiological studies conducted during the 1918–1919 pandemic were the human experiments conducted by the Public Health Service and the U.S. Navy under the supervision of Milton Rosenau on Gallops Island, the quarantine station in Boston Harbor, and on Angel Island, its counterpart in San Francisco. The experiment began with 100 volunteers from the Navy who had no history of influenza. Rosenau was the first to report on the experiments conducted at Gallops Island in November and December 1918. His first volunteers received first one strain and then several strains of Pfeiffer’s bacillus by spray and swab into their noses and throats and then into their eyes. When that procedure failed to produce disease, others were inoculated with mixtures of other organisms isolated from the throats and noses of influenza patients. Next, some volunteers received injections of blood from influenza patients. Finally, 13 of the volunteers were taken into an influenza ward and exposed to 10 influenza patients each. Each volunteer was to shake hands with each patient, to talk with him at close range, and to permit him to cough directly into his face. None of the volunteers in these experiments developed influenza. Rosenau was clearly puzzled, and he cautioned against drawing conclusions from negative results. He ended his article in JAMA with a telling acknowledgement: “We entered the outbreak with a notion that we knew the cause of the disease, and were quite sure we knew how it was transmitted from person to person. Perhaps, if we have learned anything, it is that we are not quite sure what we know about the disease.”

• Joshua says:

Interesting. Thx.

• Martha (Smith) says:

“We entered the outbreak with a notion that we knew the cause of the disease, and were quite sure we knew how it was transmitted from person to person. Perhaps, if we have learned anything, it is that we are not quite sure what we know about the disease.”

A good lesson to learn (and remember).

• Exactly! Let’s look at the probability that two people who are both negative for the disease produce tests where one is positive and one is negative. Assuming spec is the test specificity (accuracy on negative cases), then

Pr[test1 = 1, test2 = 0 | spec, both subjects negative for disease]
= (1 – spec) * (spec)

If specificity is 99.5%, that’s a 1% chance (you have to double the above because you can get 0/1 or 1/0 results). At 99%, it’s 2%. At 98%, it’s 4%, and at 95%, it’s about 10%.

Even with a highly specific test, we can expect to see this pattern a lot among pairs of test takers!
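The figures in this comment can be checked with a short sketch. The function name is illustrative, and the calculation assumes the two tests err independently, which shared lab contamination could violate.

```python
def discordant_pair_probability(specificity):
    """Chance that exactly one of two uninfected people tests positive.

    The factor of 2 covers both orderings (positive/negative and
    negative/positive). Assumes the two test errors are independent.
    """
    return 2 * (1 - specificity) * specificity


for spec in (0.995, 0.99, 0.98, 0.95):
    print(f"specificity {spec:.1%}: {discordant_pair_probability(spec):.1%}")
```

The exact values are slightly below the rounded percentages quoted in the comment (for example, 9.5% rather than 10% at 95% specificity), but the pattern is the same.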

• Sean Mackinnon says:

When it comes to the common cold, I (used to before social distancing!) get 3-4 a year. My wife never catches it from me, and has had a cold only a handful of times in the past 15 years or so.

Presumably, some people are more resistant or more susceptible to catching it than others, even beyond the false positive problem.

• Joshua says:

Much of the confusion about this is due to conflation of asymptomatic spread vs. presymptomatic spread (which is considerably more prevalent), and some arbitrariness in determining just exactly who is “symptomatic.” Plus, many serial misinformers are leveraging that conflation to downplay the severity of the pandemic.

• rm bloom says:

A careful study following navy crew (USS Theodore Roosevelt) between 3/23 and 5/18.
Significant number of positives that remained asymptomatic during the observation period.

https://www.nejm.org/doi/full/10.1056/NEJMoa2019375

“Between March 23 and May 18, 2020, a total of 1271 crew members (26.6% of the crew) tested positive for SARS-CoV-2. An additional 60 crew members had suspected Covid-19 (Figure 1). Of the 1271 crew members with laboratory-confirmed Covid-19, 572 (43.0%) remained asymptomatic throughout the outbreak, 293 (22.0%) were symptomatic at the time that they tested positive, and an additional 406 (30.5%) were presymptomatic at the time that they tested positive. In all, 978 (76.9%) of the 1271 crew members with laboratory-confirmed Covid-19 did not have symptoms at the time that they tested positive, and 699 (55.0%) had symptoms develop at any time during the clinical course.”

• Sam says:

Well it’s very clear that people without symptoms can transmit: they are most contagious a day or two before symptoms. https://www.nature.com/articles/s41591-020-0869-5

• More accurately described as ‘presymptomatic’? It appears though that the window of infectivity is still limited rather than limitless and yet you can test positive even after the infective period has passed. So what I mean is to suggest that when you are tested is a critical query toward understanding the utility of the test result.

• Sam says:

It’s hard to know at the time of the test whether you’re truly asymptomatic or simply presymptomatic. :)

But I agree with what you say.

• rm bloom says:

In the Roosevelt study there is about an 8 week time-line of observation. Indeed you are correct, persons testing positive but asymptomatic toward the end of that time-line are less indicative of the proportion of infectives who *never* become symptomatic. Whereas persons testing positive nearer to the beginning of the observation period who were asymptomatic throughout are more likely to be representative of the class of never-symptomatic infectives. In this situation the estimate of proportion of never-symptomatic infectives is spoilt — biased upward — because of the censoring problem. Cases need to be followed for much greater lengths of time to mitigate the problem of censoring at the cut-off of the observation period.

4. LJ Beck says:

But doesn’t the 20 out of 1000 (2%) who are covid positive also include false positives?

• Anonymous says:

As I have defined it, no. But actually Berkeley’s test positivity rate is currently 2.28%, so even after accounting for false positives it’s around 2%. But there are also false negatives. There are loads of good sources if you want to understand the nuances. In practice the numbers are never known precisely, and decisions rarely depend on whether the true probability of a person’s infection is 60% vs 70%, so you’re fine doing an approximation. In this case I really just want to make the point that when the base rate is low, a substantial fraction of positives can be false even if the test generates a very small number of false positives per thousand negative patients.

• Navigator says:

It is possible that more virulent but less menacing strains of covid are reaching more and more people. Or some strains that have low viral load, not enough to cause serious issues, but get picked up by the PCR test.

Each time this virus is transmitted b/w two individuals, there is an opportunity for a mutation. That’s the price we pay for being multi-cellular.

Best to look at hospitalizations and deaths. Unfortunately, there is no granular data for hospitalizations.

• Sam says:

Not by the language of the post, which says that 2% of those tested “have covid.”

5. Daniel says:

I was kinda hoping you were gonna use real data so that we wouldn’t have to track it down ourselves :D

6. Al says:

“But out of every 1000 people who do NOT have COVID, 5 of them will test positive anyway.” – Is there a source for this by the way? This seems a lot higher than I would expect from Chinese, New Zealand and Australian data where the proportion of positive tests is frequently well below 5/1000.

My only thought is that perhaps false positives are more likely to occur in labs where they are processing lots of positive tests due to contamination? Or perhaps the performance of lab staff varies substantially between countries?

• Mendel says:

That’s what I heard, too: the major source for false positive PCR tests is cross-contamination.

• Phil says:

I got this from https://www.icd10monitor.com/false-positives-in-pcr-tests-for-covid-19 which says:
> Data from External Quality Assessments of PCR tests for COVID-19: FPR between <0.4% and 0.7%, Pooled mean = 0.6%

> Data from actual use of PCR tests for COVID-19: FPR usually between 0.2% and 0.9%

So 0.5% is really just a guess that is consistent with these numbers, rather than being a known quantity. Perhaps I should have emphasized this more. The false positive rate is not actually known, nor is the false negative rate, nor is the base rate.

• What does “usually between” mean? When you start accounting for the uncertainty in those estimates rather than using point estimates, it really expands the posterior inferences. The guidelines I’ve been given in some consulting work are much lower in terms of sensitivity and specificity than the well calibrated lab results often reported.

And as that article points out, you really want to adjust for symptomatic status in the tests, as we expect the baseline prevalence to be different for those with and without symptoms. But then there are also lots of people with symptoms of colds or ordinary flu or who are just worn out. So you really need to know the type of symptom and its severity — blood oxygen down to 0.9 is a very bad symptom; a runny nose in winter not so much.

7. MJM-WA says:

Another important question to ask in this context now appears to be cycle/Ct count used by the test processing lab/group. From what I have read on Tweeter, it sounds like the cycle counts vary widely across processors and now even countries. The NFL in the US lowered the cycle count on its testing program, as did Germany. I have wondered if any Asia-PAC countries have done the same, or any states in the US. I now know several people who have tested C19 pos and who have had no symptoms whatsoever. While I have seen efforts on Tweeter to analytically grapple w/ this, I have no idea about the quality of the analyses. They are asking a very valid question, however, and it is one that is long overdue for serious consideration IMHO.

8. Chebyshev says:

Thanks Phil.

For me the most interesting part is how one interprets a probability statement when it comes to an individual i.e. what does “80% chance of correct positive” really mean here?

• Navigator says:

On individual basis, you either get the virus or not. Binary.

Nobody dies in a plane crash with probability of 0.0000456. They either die or not.

That’s one of the less cheerful aspects of probability.

• Chebyshev says:

Actually, that’s the most interesting part :)
Rest is just algebra (or calculus).

• The world is a complex place, and we only measure a tiny fraction of the state of the world.

One of the key aspects of probability is missing information. We’re missing information about, for example, whether there is a positive test because of lab contamination, because of sample contamination, because of manufacturer contamination, because of instrument malfunction, because of … etc etc.

But still if we were to enumerate all the possible states of the world which led to whatever was observed (which is bazillions of different possibilities) 80% of them involve a sick patient, and only 20% involve some other explanation.

• Anoneuoid says:

The C.D.C.’s own calculations suggest that it is extremely difficult to detect any live virus in a sample above a threshold of 33 cycles. Officials at some state labs said the C.D.C. had not asked them to note threshold values or to share them with contact-tracing organizations.

For example, North Carolina’s state lab uses the Thermo Fisher coronavirus test, which automatically classifies results based on a cutoff of 37 cycles. A spokeswoman for the lab said testers did not have access to the precise numbers.

This amounts to an enormous missed opportunity to learn more about the disease, some experts said.

“It’s just kind of mind-blowing to me that people are not recording the C.T. values from all these tests — that they’re just returning a positive or a negative,” said Angela Rasmussen, a virologist at Columbia University in New York.

“It would be useful information to know if somebody’s positive, whether they have a high viral load or a low viral load,” she added.

Officials at the Wadsworth Center, New York’s state lab, have access to C.T. values from tests they have processed, and analyzed their numbers at The Times’s request. In July, the lab identified 872 positive tests, based on a threshold of 40 cycles.

With a cutoff of 35, about 43 percent of those tests would no longer qualify as positive. About 63 percent would no longer be judged positive if the cycles were limited to 30.

In Massachusetts, from 85 to 90 percent of people who tested positive in July with a cycle threshold of 40 would have been deemed negative if the threshold were 30 cycles, Dr. Mina said. “I would say that none of those people should be contact-traced, not one,” he said.

https://www.nytimes.com/2020/08/29/health/coronavirus-testing.html

That’s pretty much still the best info out there on the topic. Like I’ve been saying since the spring. It will eventually be accepted these tests are about 70% false positives and 70% false negatives.

• Anoneuoid says:

CDC still guesses 40% (10-70%) of “infections” are truly asymptomatic:

§ The percent of cases that are asymptomatic, i.e. never experience symptoms, remains uncertain. Longitudinal testing of individuals is required to accurately detect the absence of symptoms for the full period of infectiousness. Current peer-reviewed and preprint studies vary widely in follow-up times for re-testing, or do not include re-testing of cases. Additionally, studies vary in the definition of a symptomatic case, which makes it difficult to make direct comparisons between estimates. Furthermore, the percent of cases that are asymptomatic may vary by age, and the age groups reported in studies vary. Given these limitations, the range of estimates for Scenarios 1-4 is wide. The lower bound estimate approximates the lower 95% confidence interval bound estimated from: Byambasuren, O., Cardona, M., Bell, K., Clark, J., McLaws, M. L., & Glasziou, P. (2020). Estimating the extent of true asymptomatic COVID-19 and its potential for community transmission: systematic review and meta-analysis. Available at SSRN 3586675. The upper bound estimate approximates the upper 95% confidence interval bound estimated from: Poletti, P., Tirani, M., Cereda, D., Trentini, F., Guzzetta, G., Sabatino, G., Marziano, V., Castrofino, A., Grosso, F., Del Castillo, G. and Piccarreta, R. (2020). Probability of symptoms and critical disease after SARS-CoV-2 infection. arXiv preprint arXiv:2006.08471. The best estimate is the midpoint of this range and aligns with estimates from: Oran DP, Topol EJ. Prevalence of Asymptomatic SARS-CoV-2 Infection: A Narrative Review [published online ahead of print, 2020 Jun 3]. Ann Intern Med. 2020; M20-3012.

https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html

In May that was 20-50%:

https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios-archive/planning-scenarios-2020-05-20.pdf

• Thanks Anoneuoid. That was an informative post. I know that the former Pfizer executive and researcher, Yardley Yeadon [real name Mike Yeadon] has been tweeting about the PCR test and conveying that contamination may be pervasive. Much more visceral discussion in England, I must say. Mike Yeadon may be a signatory of the Great Barrington Declaration. He also questions whether those that get COVID should be vaccinated. Here in the US some subsets broach this subject too.

• Joshua says:

Sameera –

I suggest you look at this thread on Yeadon. I note the explicit caveat regarding Yeadon’s motivations.

Yeadon looks to me like part of a group of mis-informers. Of course, I likewise can’t assess motivations.

• Hey Joshua, I did read through Andrew Croxford’s thread. Croxford does agree with some of Mike Yeadon’s hypotheses. He doesn’t share Yeadon’s dismal assessment of the PCR test.

I have been wondering whether the characterization ‘false positive’ is an accurate one in the case of a PCR test. Why? For one thing the PCR may continue to yield a positive result even after the infective period has waned or ceased. So is it correct to characterize it as ‘false’ in that specific context? I don’t think Croxford’s response addresses that point adequately. Or I’m entertaining the wrong question.

The appropriateness of the test is at issue if and when the infective period is no longer of real concern, as evidenced by the CT values. Note that these CT values are not available to either patients or physicians.

Now I am often circumspect when any scientist front-loads bold adjectives to describe any treatment or method. I approach expert opinions as a consumer rather than as an expert. My real passion is Lifestyle Medicine; Diet and Exercise.

I am not sure what is going on in England anyway. I do speculate that we have been undergoing different rates of ‘herd immunity’ in different locales since December or January. Robert Redfield, the CDC Director, estimated in June that there were perhaps 60 million or more who had been infected in the US. Not sure how he calculated that. Now there are estimates of about 140 million. If accurate, by June of this year, we may acquire another 100 million toward it if the vaccine also is distributed to another 60 million as well. This is all guesswork. But good news if evidenced to be correct.

In short, I do think way more people were infected than current figures show.

• Joshua says:

Hi Sameera –

> I am not sure what is going on in England anyway. I do speculate that we have been undergoing different rates of ‘herd immunity’ in different locales since December or January.

The term “herd immunity” gets thrown around a lot. It’s a problem. Technically, I think, it should refer to the point where you hit a threshold at which the infection rate would decrease in a naive population taking no measures, and you have relatively little chance of encountering an infected person.

Of course, infection rate might drop for any variety of reasons short of that threshold – for example behavior changes or because of interventions. And of course, if a large number of people are infected it will START to reduce the rate of infection to some degree.

But there are so many aspects of uncertainty. We have communities where infection rates may have been as high as 60%, and yet people think that “herd immunity” has come into play when infection rates are only a fraction of that. I think it’s best to avoid trying to use the term “herd immunity” to describe what’s going on. People thought that Stockholm reached “herd immunity” seven months ago, only to see the infection rate shoot through the roof since October. Now they want to say that “herd immunity” can jump back and forth over the threshold depending on season and behaviors and other factors. At some point, the term becomes rather meaningless, IMO. It’s more just a description of the current rate of spread. Why not just state the current rate of spread instead of trying to categorize it with some moving definition that varies based on context?

Did I already link this article for you? There’s some small discussion of likely numbers infected versus numbers tested positive.

https://www.theatlantic.com/health/archive/2020/11/coronavirus-death-rate-third-surge/617150/

• Joshua,

Thanks for the link to the article. The definition of herd immunity, at least among the prominent immunologists, is pretty similar. They are the one to whom I turn.

• Joshua says:

Sameera –

> The definition of herd immunity, at least among the prominent immunologists, is pretty similar. They are the one to whom I turn

So then what do you mean when you say “different rates of ‘herd immunity’ in different locales?” I’m not sure what you mean by a “rate of herd immunity.”

• Ben says:

> It will eventually be accepted these tests are about 70% false positives and 70% false negatives.

Can you unpack these numbers a bit? Are these in the units that Phil’s friend could use “I got my test back positive, therefore there is a 30% chance I have covid”, or is there a base rate or something that needs to interact here?

The way I’m interpreting this is 70% of positives people read are false positives, and 70% of the negatives people read are false negatives.

If there have been like 200 million negative tests in the US, would this say there are 140 million actual covid positives?

• Navigator says:

I believe those big false positive/negative numbers are only correct when you compare it to antigen tests, not PCRs on their own.

• Anoneuoid says:

The gold standard is to take a sample, expose a cell culture to it, then see the cells lyse and when you take the supernatant you find more copies of the viral sequence than you put in.

That would indicate there is currently infectious virus in the sample that could infect someone else. No one is doing that.

• Anoneuoid says:

Given a positive test there is ~30% chance something meaningful is going on. I.e., you could transmit to someone or suffer some symptom you would notice otherwise.

Likewise, given a negative test, there is ~70% chance you could still transmit to someone or suffer a symptom you would attribute to covid if the test had been positive.

• Ben says:

Okay so it’s not actually false positives or false negatives — you’re asserting that positive PCR means 70% chance the person doesn’t transmit and that a negative PCR means there’s a 70% chance that someone does transmit or has a covid-like symptom. That second thing is really confusing for me to try to break apart in my head.

There’s gotta be some mechanisms you’re assuming that take us from what a positive PCR means to transmission, like maybe some sort of range of infectiveness.

And then why are those percentages symmetric? The outcomes you’re predicting here aren’t the same, and so presumably the mechanisms that drive them are not the same, so why both 70%?

And also isn’t there a big base rate problem (and a bunch of assumptions about the population getting tested)? Like there have been large changes in the base rates over the year, so presumably the information available about base rates in the spring is different from what the article you’re citing says about July in Massachusetts, and now it’s December.

• Anoneuoid says:

The 70% is just a rough guess based on early reports I read. A positive/negative test result doesnt really mean anything unless you know the real world rates of poor swabbing procedures, sample handling, contamination, etc. Then theres the arbitrary cutoff in what counts as a positive. Even the WHO has apparently finally figured this out:

Users of RT-PCR reagents should read the IFU carefully to determine if manual adjustment of the PCR positivity threshold is necessary to account for any background noise which may lead to a specimen with a high cycle threshold (Ct) value result being interpreted as a positive result. The design principle of RT-PCR means that for patients with high levels of circulating virus (viral load), relatively few cycles will be needed to detect virus and so the Ct value will be low. Conversely, when specimens return a high Ct value, it means that many cycles were required to detect virus. In some circumstances, the distinction between background noise and actual presence of the target virus is difficult to ascertain. Thus, the IFU will state how to interpret specimens at or near the limit for PCR positivity. In some cases, the IFU will state that the cut-off should be manually adjusted to ensure that specimens with high Ct values are not incorrectly assigned SARS-CoV-2 detected due to background noise.

Manufacturers regularly review the design of their product, including labelling and IFU based on customer feedback. In the early phases of the COVID-19 pandemic, in vitro diagnostics (IVDs) were rapidly developed, validated and verified, and then rolled out. Therefore, it is not unexpected that IVDs may require refinement based on user feedback after their introduction at scale. Users should verify the version of the IFU with each consignment they receive to see if any changes have been made to the IFU.

https://www.who.int/news/item/14-12-2020-who-information-notice-for-ivd-users

As I’ve said from the beginning, these tests came from nowhere without public discussion and never deserved to be any kind of gold standard.

Here is Fauci talking about it months ago: https://m.youtube.com/watch?t=260&v=a_Vy6fgaBPE&feature=youtu.be
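The Ct relationship in the WHO notice quoted above can be made concrete with a small sketch. It assumes idealized amplification, where the amount of target roughly doubles each cycle (`efficiency=1.0`); real assays fall short of that, and the Ct values used here are hypothetical, chosen only to show the scale of the effect.

```python
def relative_template_amount(ct_sample, ct_reference, efficiency=1.0):
    """Fold difference in starting template, sample relative to reference.

    Assumes each cycle multiplies the target by (1 + efficiency).
    efficiency=1.0 is perfect doubling, an idealization.
    """
    return (1.0 + efficiency) ** (ct_reference - ct_sample)


# Hypothetical Ct values: one specimen detected at cycle 20, another at cycle 35.
fold = relative_template_amount(ct_sample=20, ct_reference=35)
print(f"Ct 20 vs Ct 35: about {fold:,.0f}x more starting template")
```

Under these assumptions a 15-cycle gap corresponds to a difference of roughly 2^15 in starting material, which is why a specimen near the cutoff carries far less target than one detected early.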

• Joshua says:

The problems and questions are real.

Nonetheless, hospitalizations, ICU admissions, and deaths rise steadily in correlation with the rise in positive PCR tests.

People who were downplaying the severity of the pandemic have been claiming that positive test results could be decoupled from illness and death since the summer – and are continuing to do so despite having been wrong over and over for months.

• Ben says:

> The problems and questions are real.

Yeah, but if we’re gonna start throwing out PCR as bad data cuz Ct whatever, then it seems reasonable to understand how we got there.

We started with:

> It will eventually be accepted these tests are about 70% false positives and 70% false negatives.

And it turned out that wasn’t really the case:

> Given a positive test there is ~30% chance something meaningful is going on.

And then it turned out that even this isn’t the case:

> The 70% is just a rough guess based on early reports I read. A positive/negative test result doesn’t really mean anything unless you know the real-world rates of poor swabbing procedures, sample handling, contamination, etc

So these are just rough guesses, but definitely a positive/negative test doesn’t mean anything and I can take that to the bank?

Anyway, not terribly compelling argument to throwing out PCR, which I assume is the goal, but the other bits keep changing so maybe not.

This whole discussion of Ct makes me want to read how PCR machines work. These sortsa little robot things are neat.

• Joshua says:

And yet, as positive tests go up, so do hospitalizations, ICU admissions, and deaths.

• sparkles says:

Likely because the overwhelming majority of people who get tested have a prior of infection.

• Joshua says:

How does people getting a positive result because of prior infections explain why COVID illness and death consistently rise, proportionally, along with the rise in positive test results?

• What does 80% chance of rain mean? It’s either going to rain or it isn’t. What does 50% chance of heads or tails in a coin flip mean? It’s either going to land heads or it isn’t.

Maybe Laplace’s demon could take the information we have now and predict short-term weather or the result of a coin flip, but it’s beyond people. So we encapsulate the uncertainty in our predictions using probabilities. That’s the Laplacian view of probabilities as being epistemological rather than ontological—they’re about what we know.

What a frequentist would say is that the probability of rain is the proportion of days just like today on which it would rain. Coin flips are the same deal. In this view, you need to be able to repeat days like today indefinitely for this conception to make sense.

• rm bloom says:

If the weather today is compared with a long history of “todays” and we discover it rained on 80% of those days, then we calibrate our *expectations* appropriately. Conversely if Joe is rational and he tells us, because he has just calibrated his expectations in some rational manner (but he does not or cannot say exactly *how* he did that); and — at any rate — he now says he is 80% confident it will rain today; we should suppose (if we trust his claim that he arrived at his 80% figure in a ‘reasonable’ way) that he did something much like we did; he compared ‘today’ with a history of ‘todays’. My history of ‘todays’ may be different from his; but if he and I are equally reasonable, how else could we ‘reasonably’ calibrate our expectations? We form our sets of relevant evidence and look for the pattern in it. Our reference sets may not be the same; they may be of differing quality; his or mine may not be susceptible to tallying up counts of dry and wet days — we may be forgetful, we may be inaccurate; but he and I must be approaching the matter in substance the same way. If Joe says he arrived at 80% by looking back and counting (or roughly estimating) the number of Tuesdays (suppose it is Tuesday today) on which he had Sauerkraut and Potato salad with supper; I would say, well, Joe may have come up with the same figure 80% as I did, but though he *says* he’s estimated the probability of rain, I cannot accept that is what he has done! For he’s accidentally or perversely gathered up irrelevant evidence. What is the point of this long digression: the probability of a single event may be directly or indirectly estimated from a ‘statistical history’ — whether that history is coarse or fine, whether the counting is on paper, or only in the crudest sense, confined to our own more or less hazy memories of what we’ve seen and where we’ve been.
But if the estimate is *reasonable* then it must be based on *relevant* evidence and the ‘subjective’ probability is no different from the ‘objective’ one; the former is based on Joe’s reference class (where he’s been and what he’s seen) and the latter is based on the weather-service’s logs (what they’ve seen where they were). If the weather service rain logs got mixed up with the Geological survey’s earthquake logs and got re-labeled accidentally, then estimates of rain based on the fumbled logs would be just like Joe’s estimate of rain based on Sauerkraut and Potato salad suppers.

• rm bloom says:

Yes, “they’re about what we know”, but: what we know about *rain*!

9. Patrick Linehan says:

What were your friend’s symptoms? These matter when you calculate the pretest probability.

If your friend had a known COVID contact a week ago, fever, headache, anosmia (loss of the ability to detect scent), hypoxia, and a bunch of patchy infiltrates on the chest x-ray would you still argue their COVID test had a 20% chance of being a false positive?

On the other hand if your friend was entirely asymptomatic and has been physically distancing the false positive rate is probably greater than 20%.
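The dependence on pretest probability raised in this comment can be sketched numerically. This is a minimal illustration, not a clinical tool: the 0.5% false positive rate and the 2% base rate come from the post, while the "low" and "high" pretest probabilities and the default sensitivity of 100% are assumptions made only for the example.

```python
def false_positive_probability(pretest_prob, sensitivity=1.0,
                               false_positive_rate=0.005):
    """P(not infected | positive test), by Bayes' rule."""
    true_pos = pretest_prob * sensitivity
    false_pos = (1.0 - pretest_prob) * false_positive_rate
    return false_pos / (true_pos + false_pos)


scenarios = [
    ("base rate from the post", 0.02),
    ("asymptomatic, distancing (hypothetical)", 0.005),
    ("known contact plus symptoms (hypothetical)", 0.5),
]
for label, p in scenarios:
    share = false_positive_probability(p)
    print(f"{label}: pretest {p:.1%} -> P(false positive) = {share:.1%}")
```

With the post's numbers this gives about 20%; the hypothetical low-pretest case lands near 50% and the high-pretest case well under 1%.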

• Mendel says:

A typical PCR test machine in a big lab will process a tray of 96 or 144 samples at once, including controls.
The machine is preset to a specific number of cycles that are the same for every sample. Obviously this number does not allow any inference on how infectious a sample is.

In a diagnostic setting to determine the viral load of an infected patient, their sample is processed at various numbers of cycles, and the threshold number of cycles at which the process registers the infection is a measure of the viral load at the time that the sample was taken. So this makes sense if you are studying infectiousness in a patient over time, with samples taken every day or every 12 hours. Obviously, this requires much more effort, since every sample is tested many times at various cycle values.

For a simple diagnostic of whether a patient is infected or not, all that effort would be useless: since there is usually just one sample, and it could be taken at any point during the virus life cycle, it doesn’t even tell you the patient’s peak infectiousness. There’s also sampling variety depending on how skillfully the sample was taken, and the length of time (and the temperature) that the virus was stored for also affects this. So instead of wasting these extra tests on getting a CT that doesn’t mean anything, the test capacity is used to test more people.

The CT value that a lab uses depends on their equipment and the materials they’re using.

• Phil says:

As I said in the post: “(That’s if we had no other information. In fact we have the additional information that he is asymptomatic, which increases the chance [that his result is a false positive]. He still probably has COVID, but it’s very far from the 99.5% chance that a naive estimate would suggest. Maybe more like a 65% chance).”

10. Mark Gilmour says:

What type of test? NZ has performed circa a million tests, all PCR. We have some (suspected) false negatives, but there has been literally zero reporting of false positives, and I can be damn sure of this because a single positive case in the community is a massive national news story.

Do they automatically perform more testing on positive samples to rule out false positives, or is PCR truly immune to false positives and your friend is using an antibody test?

• Z says:

I would like to see a reference for that .5% number as well.

PCR tests can have false positives from cross contamination. This wouldn’t happen in NZ because there are so few positive samples that might contaminate the negative ones. I have no idea how often it happens in the US (would seemingly depend on prevalence and the lab).

• Phil says:

I have a link on another comment.

It’s an important point that cross-contamination is the cause of false positives. This suggests the false positive rate should depend on the base rate, an interesting complication.

11. Edward says:

– we don’t know the real-world accuracy of the PCR test.
– we don’t know the actual prevalence of COVID-19 in the general population.
– we are thus guessing.

– Dead bodies are much easier to count (COVID-19 mortality rate).
– But CDC rules count any death ‘with’ any COVID-19 diagnosis as a death ‘from’ COVID-19.
– U.S. heart disease death rate decreased dramatically this year; that has never happened before. Why?

• anon e mouse says:

Your second to last bullet point is something I usually hear from conspiracy theorists…

https://www.cdc.gov/nchs/nvss/vsrr/covid19/excess_deaths.htm

Nearly impossible to come up with a plausible explanation for this ^^ that doesn’t involve a LOT of COVID mortality.

• Brent Hutto says:

I doubt even the most hard-core tinfoil hat wearer denies that COVID has been responsible for “a LOT” of mortality.

There’s room for two things to both be true: a) yes, COVID is very prevalent and very deadly and b) yes, CDC and other supposed authorities have been full of b.s. on their definitions for the entire pandemic.

• Joshua says:

Brent –

> I doubt even the most hard-core tinfoil hat wearer denies that COVID has been responsible for “a LOT” of mortality

If you looked around in places like Twitter or rightwing blogs, you’d find that’s not even close to true.

On the grand scale it’s hard to say how many people that actually represents. Social media is rather an outlier group. But within that world the view that COVID is basically a big hoax is pretty prevalent.

• Carlos Ungil says:

> I doubt even the most hard-core tinfoil hat wearer denies that COVID has been responsible for “a LOT” of mortality.

How much is “a LOT”? Around 300’000 deaths in the US as reported? Less than 250’000? More than 350’000?

• Joshua says:

In case someone hasn’t been looking, it’s easy to find people arguing that there haven’t been excess deaths due to COVID. One guy, Michael Levitt, gets a lot of traffic on Twitter. He was in a team that won a Nobel in computational chemistry a while back. He recently claimed, when the EU was reporting 2K deaths a day, that they were experiencing “no deaths” based largely on the “excess deaths” line of reasoning. That line of thinking isn’t particularly hard to find if you look. I don’t exactly recommend looking. Many claim that COVID deaths are actually flu deaths, or deaths of people who are actually dying of other causes.

A popular claim is that only something like 9% of the reported COVID deaths are actually deaths from COVID.

• Brent Hutto says:

If I had to guess, my supposition is there’s been more room for undercounting than overcounting. So something in the 300,000-350,000 range seems believable to me. Or maybe even 400,000 or maybe even 250,000. The uncertainty is huge.

Given the truly abysmal shortcomings in basic epidemiological surveillance methods, I have no doubt at all that both substantial undercounting AND overcounting have both been occurring. So the true numbers could be anywhere in that general range and we’ll probably never know.

About the only way of assessing the overall impact of COVID-19 is to simply look at all cause mortality for the period March, 2020 to present and compare it to similar periods in the recent past. Whether due to people dying “of” COVID or people dying due to the disruptions or people not dying of other things because they’re not [fill in the blank usual risk-entailing activities], when you sum it all up the difference in all cause mortality this year vs. normal years might as well be ascribed to COVID directly or indirectly. That has the advantage of not being subject to manipulation by ever-changing case definitions or cause ascertainment!
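The all-cause comparison described above amounts to simple arithmetic, sketched here. The counts are made-up placeholders, not real mortality data, and using a plain average of prior years as the baseline is an assumption; published excess-death estimates typically use more careful baselines that account for trend and seasonality.

```python
def excess_deaths(observed, baseline):
    """Sum of (observed - expected) all-cause deaths over matching periods."""
    if len(observed) != len(baseline):
        raise ValueError("observed and baseline must cover the same periods")
    return sum(o - b for o, b in zip(observed, baseline))


# Made-up illustrative counts per period; NOT real data.
observed_this_year = [100, 130, 150, 120]
baseline_prior_years = [100, 105, 102, 101]  # e.g. an average of recent years

print(excess_deaths(observed_this_year, baseline_prior_years))
```

Nothing in this calculation depends on how any individual death was classified, which is the advantage the comment points to.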

• Carlos Ungil says:

I agree that the uncertainty is huge. Let’s take a relatively wide range of 200’000-400’000 as reasonable. As Joshua mentions, it’s not hard to find people who say that true covid deaths (if any!) are well below that range. I even suspect some people here may object to the 200’000-400’000 range.

• Carlos Ungil says:

Btw, I wonder what may be Ioannidis’ estimate of “deaths by covid” nowadays. It could fall outside that range.

In his “fool’s confession and dissection of a forecasting failure” he explained that one of the reasons why his prediction of “fewer than 40,000 deaths this season in the USA” was so wrong is that not only “COVID-19 deaths” are included in the count but also “deaths with COVID-19” and even deaths “without COVID-19 documented”.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7447267/#appendix

• I had says:

I had not seen that Ioannidis “confession.” What a jackass. Just cannot admit he was wrong about the virus, it’s all somebody else’s fault that his estimate was so wrong. Infuriating.

• Andrew says:

I:

We discussed the Ioannidis, Cripps, and Tanner article here. The “fool’s confession” part seems to have been added after my review. Here’s what Ioannidis wrote there:

“If I were to make an informed estimate based on the limited testing data we have, I would say that COVID-19 will result in fewer than 40,000 deaths this season in the USA” – my quote appeared on April 9 in CNN and Washington Post based on a discussion with Fareed Zakaria a few days earlier. . . . when he sent me the quote that he planned to use, I sadly behaved like an expert and endorsed it. Journalists and the public want certainty, even when there is no certainty.

Here is an effort to dissect why I was so wrong. Behaving like an expert (i.e. a fool) was clearly the main reason. . . .

I don’t understand that equating “an expert” with “a fool.” An expert can know to express uncertainty, right? Maybe “behaved like a pundit” would be a better way of putting it. I don’t see why “behaving like an expert” should imply overstatement of certainty.

Also in that section, Ioannidis writes:

I have no personal social media accounts – I admire people who can outpour their error-free wisdom in them, but I make a lot of errors, I need to revisit my writings multiple times before publishing, and I see no reason to make a fool of myself more frequently than it is sadly unavoidable.

As a person with a social media account (this blog!), I disagree with the implication that social media is for “error-free wisdom.” I make errors all the time! That’s ok. The key is to not claim undue certainty.

• The research environment has been so hypercompetitive that it has distorted our efforts to do science. I’m sure that John’s original 2005 PLOS article was not so welcome in some circles. That has dogged his reception to this day among a few subsets. Nevertheless John Ioannidis is to be commended for some original insights and the ability to reach many audiences.

Actually I have disagreed with a couple of John’s stances. It hasn’t depreciated my respect for his scholarship. Same goes for the many super talented bloggers here, and for Andrew’s original and enormous contributions to so many subject areas. Thanks.

• Joshua says:

Long after that sorry-assed prediction of 40k deaths, Ioannidis went on a massive media campaign and said that Covid was basically like the seasonal flu. Long after that he said the IFR was much lower than many others estimate it to be. He has systematically treated uncertainties in a selective manner (only in a way that lowers his estimate of the impact of the pandemic). From his work with the Diamond Princess estimate forward, he extrapolated from unrepresentative samples.

Saying he was wrong but wasn’t really wrong but he had an excuse for why he was wrong and it was someone else’s fault and the media’s fault because they didn’t publish what he wrote and because he was actually right seems about par for the course here.

• Chris Wilson says:

Here is Ioannidis being interviewed on April 23rd to hype the Santa Clara antibody study on a YouTube channel that seemed to host many other – more dubiously qualified – ‘skeptics’ on Coronavirus.
He asserts several times the risk for the large majority is no more than flu, strongly implies that most of the measures are unwarranted and causing more side effects than the disease, etc etc.
Self reflection is hard, but I find his attempted mea culpa unconvincingly done. He’s made great contributions, but whiffed this one and can’t seem to own it.

• confused says:

@Joshua “The problem isn’t so much with the methodologies as with the need to have the calculations accurate in a very short time frame. As Sander writes below, there are a lot of moving parts.”

yeah, three years from now we will probably have a much better idea of what happened this year.

What I really want to see is an explanation for the “longitude” effect – all the high-mortality-rate (per capita*) nations are in Europe and the Americas, and the pattern seems to cut across nations of wildly different wealth, government systems, and responses. The low death rates in places like South Korea, Japan, and Australia can be attributed to a combination of response plus a degree of isolation allowing responses to work better than elsewhere… but why have less-developed nations in tropical Asia and Africa been hit so much less hard than tropical America?

• Joshua says:

confused –

You’re thinking an “effect” from longitude rather than culture or commonality of some combination of variables?

I dunno. South Korea had superspreader events. But they dealt with them very efficiently. They’re currently having a spike, suggesting similar dynamics in play but because their baseline is so ridiculously low compared to ours it winds up looking so totally different. The Asian countries, with a shared Confucian heritage, have a quite different attitude towards authority and responsibility to society over individuality. New Zealand/Australia obviously have certain structural advantages. Kiwis have a strong sense of social responsibility. Aussies have a wide irreverent/anti-authority streak…but it still seems they had a lot of compliance with interventions?

• Joshua says:

For Africa, wouldn’t age be an obvious explanatory variable?

• Re: ‘Given the truly abysmal shortcomings in basic epidemiological surveillance methods, I have no doubt at all that both substantial undercounting AND overcounting have both been occurring. So the true numbers could be anywhere in that general range and we’ll probably never know.’
—-
This situation is hard to fathom given how many epidemiologists there are on the globe. There has to be a good data hub somewhere.

• Sander says:

The core data problem has nothing to do with epidemiologists – it is a medical-recording problem with the frontline diagnoses and completions of death certificates. Those are up to attending physicians, coroners, medical examiners or other parties in charge of recording ’causes of death’, who are often elected or appointed officials. In many if not most jurisdictions a coroner need not be medically trained, and even physicians show large variations in recording tendencies. There is no agency that has established personnel, diagnostic, or recording standards or criteria adequate to cope with the current pandemic across the 3,100+ U.S. counties, parishes etc. let alone the globe. This problem reflects a defect in vital statistics that has been lamented for decades and for which there is no clear politically feasible solution.

• Sander, I was not trying to pin the data problem on epidemiologists. But is it not the case that researchers, including epidemiologists, draw on vital statistics even though they are defective? Congress has recently passed an Evidence Based Policy Act. So maybe the vital statistics criteria and standards can be addressed by the government. I hear that there is a big push to improve data. Occasionally, I get an invite to a forum that hosts data scientists and AI techies. There appear to be some good analyses.

• Joshua says:

Of course, with time epidemiologists can take samples and validate ways to correct for errors. The problem isn’t so much with the methodologies as with the need to have the calculations accurate in a very short time frame. As Sander writes below, there are a lot of moving parts.

• Sander Greenland says:

Sameera: Most of what I see in the press regarding vital statistics for covid is oblivious to the problems. Good epidemiologists hedge their comments in recognition of them. We seem to be in a year where the public finally appreciates that ‘epidemiologist’ is not ‘dermatologist’; but unfortunately the number of researchers (or pundits presented as researchers) who are quoted as ‘epidemiologists’ by the press seems to have expanded many fold, and those ‘excess epidemiologists’ largely ignore these problems – in particular the uncertainty those problems add to estimates of excess mortality ‘due to’ covid-19 directly vs. those due to social reactions to pandemic effects.
That said, I don’t know about the analyses you refer to or how you judged them ‘good’.

• RE: ‘That said, I don’t know about the analyses you refer to or how you judged them ‘good’.’

I meant to write ‘analysts’. Some of the presentations have dealt with the problems with data; in particular the lack of access to pertinent data in many fields. I characterize these analysts as good for identifying the analytic and cognitive biases that operate in research. Of course conflicts of interest are endemic. Discussing them is a good start.

Andrew and you have frequently presented on the quality of data in your respective fields. Nevertheless, we know that within each field there are stellar, not so stellar, and really awful thinkers. That is a concrete reality.

• Sameera;
I’ve a simple question that maybe you know the answer to: If you have antibodies to the common cold coronavirus, will you test positive for Covid-19? Does anyone know?

• Sander what further proof do you need of ‘good’ since I think you are weely weely weely stellar thinker. LOL

So no more doubt that I am capable of evaluating the ‘good’.

• Hi Deborah. Nice to see you here.

Re: I’ve a simple question that maybe you know the answer to: If you have antibodies to the common cold coronavirus, will you test positive for Covid-19? Does anyone know?
——-

As you recall, I tweeted this same question a couple of weeks ago. It hasn’t been answered by anybody. I’m a little surprised that it hasn’t. What I do accept is that if you have had one of the other coronaviruses, you may experience less severe symptoms and may have T-cell immunity.

An additional question I have had is whether, if you do contract COVID19, you need to get a vaccine. I gather from Michael Mina that you may not need to vaccinate. I think it would be smart to get an antibody test in any case, recognizing that some questioned the accuracy of the antibody test circulating this last spring.

• Just to re-affirm Sander’s point “There is no agency that has established personnel, diagnostic, or recording standards or criteria adequate to cope with the current pandemic”

It is alarming and palpable in my social interactions with former schoolmates who went into epidemiology 20+ years ago and are currently in a position to have access to more information than most, and they are all missing information on different aspects of what actually needs to be thought through. It’s just a general failing of the scientific community to communally enable their members to learn how they are wrong about important aspects. It takes infrastructure and channels; they’re not there and they can’t just be built in a month?/year?

• Joshua says:

Sameera –

> and may have T-cell immunity.

What do you mean by “T-cell immunity?”

If you mean that because of T-cell reactivity you won’t get infected, you are likely mistaken. That is a likely mistaken belief that is being spread around by people who are diminishing the seriousness of the pandemic. Those who actually study what T-cell reactivity comprises say that it might lead to less severe disease but doesn’t likely protect people from getting infected.

• Joshua says:

Sameera –

Also this:

https://www.nature.com/articles/s41577-020-00460-4

• Ben says:

> I’ve a simple question that maybe you know the answer to: If you have antibodies to the common cold coronavirus, will you test positive for Covid-19? Does anyone know?

This question doesn’t seem simple, but I could just be saying that because you say it is simple. For instance, why are we asking this?

The question is simple only if I interpret it very literally. Does anyone know? The first sentence asks, does A imply B, where A and B are both measured sorts of things that we could argue about.

So does anyone know A implies B? No, because A and B aren’t even known absolutely themselves.

• Joshua,

Immune Cells for Common Cold May Recognize Sars-CoV-2

https://www.nih.gov/news-events/nih-research-matters/immune-cells-common-cold-may-recognize-sars-cov-2

“We have now proven that, in some people, pre-existing T cell memory against common cold coronaviruses can cross-recognize SARS-CoV-2, down to the exact molecular structures,” Weiskopf says. “This could help explain why some people show milder symptoms of disease while others get severely sick.”

“It still remains to be addressed whether this immune memory reactivity influences clinical outcomes and translates into some degrees of protection from more severe disease,” adds Sette. “Having a strong T cell response, or a better T cell response may give you the opportunity to mount a much quicker and stronger response.”

• Anoneuoid says:

What about the healthcare system treating people differently if they test positive?

• Navigator says:

Edward,

That’s the case with any illness. People don’t die ‘from’ flu either, but because of complications initiated by flu. Just look at how many survive flu each year.

COVID has a few signature symptoms (low O2 saturation, dry cough, absence of smell, etc.) but whether it’s flu, COVID or anything else, CDC lumps it all in ILI (influenza-like illnesses).

Only after several positive, albeit imperfect tests have been done, it is sub-categorized as COVID, flu, whatever.

Human body reacts to invaders in very similar ways in most cases, thus the confusion.

12. Stephen Senn says:

Yes and no. This analysis assumes that sensitivity and specificity are fixed parameters that do not vary with prevalence. In other words, it takes sensitivity and specificity as primitives.
There is, however, a long tradition of criticising this view as naive. See, for example

1. Dawid AP. Properties of Diagnostic Data Distributions. Biometrics. 1976;32:647-658.
2. Miettinen OS, Caro JJ. Foundations of Medical Diagnosis – What Actually Are the Parameters Involved in Bayes Theorem. Statistics in Medicine. 1994;13(3):201-209.
3. Guggenmoos-Holzmann I, van Houwelingen HC. The (in)validity of sensitivity and specificity. Statistics in Medicine. 2000;19(13):1783-1792.

The third of these https://www.researchgate.net/publication/12456501_The_invalidity_of_sensitivity_and_specificity
has a title that points to the problem.
As the authors put it: “Bayes theorem does not care on which one of two events we condition. That is, predictive values may be expressed in terms of accuracy and prevalence, and accuracy parameters such as sensitivity may be expressed in terms of predictive values and the probability of a positive test result.”

Note, that I am not necessarily disputing the analysis in this blog. It may be valid in this case. It is just that it is not generally valid. Ever since I first encountered the counter-argument (through Miettinen & Caro’s paper & then through Dawid’s and finally G-H & van H) I have been wary about the approach presented here.
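The symmetry in the quoted passage can be written out explicitly. The notation here is an assumption introduced for illustration, not taken from the cited papers: D is disease, T+ a positive test, π = P(D) the prevalence, τ = P(T+) the probability of a positive result, Se and Sp the sensitivity and specificity, and PPV and NPV the predictive values.

```latex
\begin{aligned}
\mathrm{PPV} = P(D \mid T^{+})
  &= \frac{\mathrm{Se}\,\pi}{\mathrm{Se}\,\pi + (1-\mathrm{Sp})(1-\pi)}, \\[4pt]
\mathrm{Se} = P(T^{+} \mid D)
  &= \frac{\mathrm{PPV}\,\tau}{\mathrm{PPV}\,\tau + (1-\mathrm{NPV})(1-\tau)}.
\end{aligned}
```

The two lines have the same form, so nothing in Bayes' theorem itself singles out sensitivity and specificity as the fixed quantities; treating them as constant across prevalence is a modelling assumption.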

• That seems to be the usual case in any statistical topic’s assumptions, there is always another mountain behind any mountain or inexhaustible refinements of the assumptions.

But people have to start somewhere, and though I believe this is the best example of starting folks out on this topic’s assumptions https://www.youtube.com/watch?v=XmiEzi54lBI&feature=youtu.be it overlooks the possible dependencies you bring out here.

Certainly appropriate to bring that out for this audience, so thanks. (I did run into it when transporting sensitivity and specificity from a referral clinic to a general population.)

• Stephen Senn says:

Thanks. Clinic to general population would be a very dangerous case. It may be that for random testing it’s OK (roughly) to assume constant specificity and sensitivity (but given that the prevalence may be changing rapidly one should be cautious) but diagnosis and prevalence assessment are not the same thing. What I worry about (as a complete amateur) is how they estimate these parameters in the first place.

• Used one of the methods from here – Intervals for posttest probabilities: a comparison of 5 methods. D Mossman, J O Berger https://pubmed.ncbi.nlm.nih.gov/11760107/

Paper was published, but I have never been able to obtain a copy and can’t remember much other than originally the study group insisted I not do anything fancier than calculate a correlation coefficient. When I suggested something else was really needed, they tried to replace me with another statistician. When they were unable to do that, they started to come around.

• Phil says:

If the false positive rate depends on the base rate (due to cross-contamination) that could change things a lot. It also seems like something that could change with time as testing facilities learn and improve, or could get worse as they get overwhelmed or as employees get fatigued.
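To see how much a prevalence-dependent false positive rate could matter, here is a toy sketch. The linear form `baseline_fpr + contamination_factor * prevalence` and the value 0.1 for `contamination_factor` are pure assumptions for illustration; nothing in this thread establishes how cross-contamination actually scales. The 0.5% baseline and 2% prevalence are the post's figures.

```python
def false_positive_share(prevalence, baseline_fpr=0.005,
                         contamination_factor=0.0, sensitivity=1.0):
    """Share of positives that are false when the FPR grows with prevalence.

    Hypothetical model: fpr = baseline_fpr + contamination_factor * prevalence.
    """
    fpr = baseline_fpr + contamination_factor * prevalence
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * fpr
    return false_pos / (true_pos + false_pos)


for k in (0.0, 0.1):  # 0.0 reproduces the fixed-rate calculation in the post
    share = false_positive_share(0.02, contamination_factor=k)
    print(f"contamination_factor={k}: false-positive share = {share:.1%}")
```

With the factor at zero this reproduces the post's roughly 20%; the hypothetical nonzero factor pushes the share to about a quarter.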

• Wiley. And paywalled. Can you summarize the point? The abstracts are vague. Are they worrying about sensitivity and specificity being related to how severe a case of a disease one has? Or that there’s not a constant sensitivity and specificity across time for a lab? There’s not consistent sensitivity and specificity for the person operating a machine within a lab either. As Keith pointed out, statistics is fractal and you can start with a simple model and then, with enough data, refine it. The problem is that without enough data, you can’t fit a model beyond something simple.

The paper Keith cited is from Sage and also paywalled.

13. Zhou Fang says:

> Then you just have to think through the numbers, which is really easy as I have illustrated above.

Well, you also have to make some assumptions about whether your friend can be regarded as being randomly sampled from the population distribution of people testing for covid19, or whether, for example, your friend is testing for stronger reasons than most of the other people being tested.

• Phil says:

I suppose it depends on what you want to get out of the analysis. In practice there are many complications involving (at least): the base rate isn’t known very precisely, the false positive rate isn’t known very precisely, and there’s additional information that isn’t included (e.g. my friend may differ in important ways from the other people being tested). If there were big implications in determining whether the “true probability” of a false positive is 60% vs 75% then yeah, it’s not so easy. Indeed it is not even easy to determine what one would mean by “true probability” in a case like this.

14. Alain says:

This example assumes that all the covid cases in Berkeley result in positive tests (i.e. sensitivity of 100%).
Assuming a sensitivity of 85% (typical of RT-PCR tests), you would get 22 positive tests (17 true pos. + 5 false pos.)
and the chance that your friend is a false positive would be 5/22, which reinforces your point.
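The count-based version of this adjustment can be sketched as follows, using the post's figures (1000 tested, 2% prevalence, 0.5% false positive rate) together with the 85% sensitivity suggested in this comment. The function name is just an illustrative choice.

```python
def expected_counts(n_tested, prevalence, sensitivity, false_positive_rate):
    """Expected true-positive and false-positive counts among n_tested people."""
    infected = n_tested * prevalence
    true_pos = infected * sensitivity
    false_pos = (n_tested - infected) * false_positive_rate
    return true_pos, false_pos


tp, fp = expected_counts(1000, prevalence=0.02, sensitivity=0.85,
                         false_positive_rate=0.005)
print(f"true positives ~ {tp:.1f}, false positives ~ {fp:.1f}")
print(f"share of positives that are false ~ {fp / (tp + fp):.1%}")
```

This gives about 17 true positives and about 5 false positives, so roughly 22% of positives are false, slightly higher than under perfect sensitivity.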