Is omicron natural or not – a probabilistic theory?

Aleks points us to this article, “A probabilistic approach to evaluate the likelihood of artificial genetic modification and its application to SARS-CoV-2 Omicron variant,” which begins:

A method to find a probability that a given bias of mutations occur naturally is proposed to test whether a newly detected virus is a product of natural evolution or artificial genetic modification. The probability is calculated based on the neutral theory of molecular evolution and binominal distribution of non-synonymous (N) and synonymous (S) mutations. Though most of the conventional analyses, including dN/dS analysis, assume that any kinds of point mutations from a nucleotide to another nucleotide occurs with the same probability, the proposed model takes into account the bias in mutations, where the equilibrium of mutations is considered to estimate the probability of each mutation. The proposed method is applied to evaluate whether the Omicron variant strain of SARS-CoV-2, whose spike protein includes 29 N mutations and only one S mutation, can emerge through natural evolution. The result of binomial test based on the proposed model shows that the bias of N/S mutations in the Omicron spike can occur with a probability of 1.6 x 10^(-3) or less. Even with the conventional model where the probabilities of any kinds of mutations are all equal, the strong N/S mutation bias in the Omicron spike can occur with a probability of 3.7 x 10^(-3), which means that the Omicron variant is highly likely a product of artificial genetic modification.

I don’t know anything about the substance. The above bit makes me suspicious, as it looks like what they’re doing is rejecting null hypothesis A and using this to claim that their favored alternative hypothesis B is true.
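To make the rejected null hypothesis concrete, here is a minimal sketch of the kind of binomial tail calculation the abstract describes. The per-mutation probability of a non-synonymous change (`p_nonsyn`) is not given in the quoted text, so the values below are illustrative assumptions, not the paper's numbers. Whatever the result, a small tail probability only says that this particular neutral model fits poorly; it does not say which alternative explanation is correct.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts taken from the quoted abstract: 29 N mutations and 1 S mutation.
N_MUTATIONS = 30
N_NONSYN = 29

# Assumed values for the chance that a random mutation is non-synonymous
# under a neutral model. These are hypothetical, chosen only to show how
# sensitive the tail probability is to this one input.
for p_nonsyn in (0.70, 0.75, 0.80):
    tail = binomial_upper_tail(N_NONSYN, N_MUTATIONS, p_nonsyn)
    print(f"p_nonsyn={p_nonsyn:.2f}  P(N >= 29 of 30) = {tail:.2e}")
```

The tail probability moves by more than an order of magnitude across this modest range of `p_nonsyn`, and none of these values accounts for selection favoring non-synonymous changes.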

Further comments from an actual expert are here.

12 thoughts on “Is omicron natural or not – a probabilistic theory?”

  1. I know nothing about the substance either. But I do think this is a good example of what frustrates me about Twitter so much. I find it difficult to follow any chain of thought, and I find it difficult to even identify clearly who is saying what in response to whom/what. Beyond that, the stated result strikes me the same way it struck Andrew. It also sounds a lot like the “analyses” of the 2020 election that found that the probability of so many D votes coming in the mail-in ballots was one in a zillion. Once you make an extreme assumption (e.g., that the votes should come in equally or that mutations should occur with equal probability), it then becomes easy to reject it, which leads you to conclude that your favorite theory is true rather than simply that your extreme assumption was unwarranted to begin with.

  2. It might be useful to compare genetic variations in SARS-CoV-2 to genetic variants found in other infections rather than a mathematical model of mutations. RNA viruses often produce genetic variants during their duplication process, and each variant is subjected to pressure from the host immune system resulting in the destruction of most of them but the amplification of the more virulent ones. This is certainly observed in HIV. https://www.science.org/doi/10.1126/science.abk1688
    Similar effects of unstable replication of an RNA virus interacting with the host immune system are seen in Hepatitis C.
    Real infections in the real world may act differently than an abstract mathematical model suggests.

    • “each variant is subjected to pressure from the host immune system resulting in the destruction of most of them but the amplification of the more virulent ones.”
      Virulence has to be optimal where the most virulent strains die off with the host. During the first months of the pandemic there were very deadly strains that died with hosts during a very fast evolutionary pressure. There were reports of very healthy young adults dropping like flies in Wuhan, but only in the very beginning. We are talking days and weeks perhaps. Those strains were probably never seen again.
      My point is that the most virulent strains cannot go too far and with time the less virulent ones (or the ones with a delayed action) will survive and thrive.

    • “This is certainly observed in HIV….Similar effects..in Hepatitis C.”

      But HIV and HepC are not respiratory viruses. They transfer far less rapidly through a population and take much longer to manifest symptoms.

      The more appropriate model for COVID, as for example the Mount Vernon WA case study indicated, is the 1918 flu pandemic. Navigator describes people dropping dead nearly instantly from COVID in the early stages of the pandemic in Wuhan. The same thing happened in the 1918 pandemic in the early stages in the US. In some cases nearly entire army barracks died within a few days of infection.

      This book describes the events of the 1918 pandemic in 19 hours of gory detail.

      • I looked at a couple of articles regarding viral load in influenza A and RSV. They reported 10,000 to 100,000,000 virions per nasal swab. To me this means that each infected person produces millions or tens of millions of virions during the course of their illness. The inherent instability of RNA virus replication makes me think that some weird RNA sequences may well occur without the need for deliberate human action. If I throw a ball from half court I will make a three pointer if I get a million shots.
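The half-court analogy above can be put in numbers with a minimal sketch: the chance of at least one success in `n` independent tries, each with success probability `p`, is 1 - (1 - p)^n. The per-try probabilities below are purely illustrative assumptions; nothing in the thread gives an actual per-replication probability for any particular sequence.

```python
import math


def prob_at_least_one(p, n):
    """P(at least one success in n independent trials), for 0 <= p < 1.

    Computes 1 - (1 - p)**n via log1p/expm1 so that very small p
    and very large n do not lose precision.
    """
    return -math.expm1(n * math.log1p(-p))


# Hypothetical per-trial probabilities and trial counts, for illustration only.
for p in (1e-6, 1e-9):
    for n in (10**6, 10**9, 10**12):
        print(f"p={p:.0e}  n={n:.0e}  P(at least one) = {prob_at_least_one(p, n):.4f}")
```

An event that is very unlikely on any single try becomes close to certain once the number of tries is large compared with 1/p, which is why a small per-event probability needs to be set against the number of opportunities.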

  3. I missed the place where the probability of a certain mutation occurring is multiplied by the number of chances for a mutation to occur. Maybe there was such a calculation, but it was not clear to me from the quote.

    I also recall a similar calculation about a different virus by Dr. Behe in which he claimed that there was only one mutational path, but in another paper at least seven were found.

  4. Here’s some of the background substance. A non-synonymous substitution in a protein-coding gene changes the protein’s structure; a synonymous substitution changes the nucleotide sequence of the gene, but not of the protein, so the protein is unchanged. (Synonymous mutations occur because of the degeneracy of the genetic code– the same constituent bit of a protein may be coded for by more than one bit of nucleotide sequence.) The dN/dS ratio, properly calculated, is the ratio of non-synonymous substitutions to synonymous substitutions in a protein-coding sequence. If the non-synonymous substitutions make no difference to the organism’s survival and reproduction, then the dN/dS ratio should be about 1; if it is, the changes are said to be “neutral”. If it’s less than 1, then the non-synonymous substitutions are “bad”, and are being removed by what is called “purifying selection”. If it’s greater than 1, then the non-synonymous substitutions are “good”, and are being favored by “positive selection”. (A recent modestly-accessible primer on the ratio is this recent paper by Alvarez-Carretero, Kapli, and Yang: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10127084/)
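As a rough illustration of the synonymous versus non-synonymous distinction described above, the sketch below classifies single-nucleotide changes using the standard genetic code and counts how many of each kind are possible. It assumes every sense codon and every nucleotide change is equally likely, which is the "conventional" simplification mentioned in the abstract, not a realistic model of any particular gene. The function names are made up for this example.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify(ref_codon, alt_codon):
    """Classify a single-nucleotide change between two codons."""
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"  # sense codon turned into a stop codon
    return "non-synonymous"


def count_single_changes():
    """Count all single-nucleotide changes starting from the 61 sense codons."""
    counts = {"synonymous": 0, "non-synonymous": 0, "nonsense": 0}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # skip changes that start from a stop codon
        for pos in range(3):
            for base in BASES:
                if base != codon[pos]:
                    mutant = codon[:pos] + base + codon[pos + 1:]
                    counts[classify(codon, mutant)] += 1
    return counts


counts = count_single_changes()
print(counts)
viable = counts["synonymous"] + counts["non-synonymous"]
print("non-synonymous share, excluding nonsense:", counts["non-synonymous"] / viable)
```

Under these equal-weight assumptions, 392 of the 526 changes that do not create a stop codon are non-synonymous (about 75%), so an excess of N over S mutations is the baseline expectation even before any selection. A real analysis would weight by the gene's actual codon usage and by unequal mutation rates.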

    Going only off the extract quoted by Andrew, the dN/dS ratio is 29; this is quite a bit more than 1. This indicates strong positive selection. It does not indicate that the selection was artificial, however– it could easily be natural selection. The statistical question is really one of estimation, not of hypothesis testing. Once you have evidence about the value of the ratio, there are loads of hypotheses that can be entertained about why the ratio is what it is– it’s not as though “natural evolution” and “artificial genetic modification” map onto particular values or ranges of values of the ratio.

    It does seem rather Behe-esque, although the style of argument goes back to at least Arbuthnot.

    • So, any mutation imbalance is consistent with the standard viewpoint that Omicron was simply incubated for many months inside a chronically-sick human (of the sorts we can observe in sewage), which would look similar to serial passaging.

  5. It is an interesting observation (Omicron variant has 29 non-synonymous mutations and only 1 synonymous one). Clearly a variant with a significant phenotypic difference is going to have several, and maybe many, non-synonymous mutations (else it wouldn’t have a different phenotype!), but at first sight the very low number of synonymous ones is surprising. But as others have pointed out, it’s illogical to infer that this in itself means that the Omicron variant was created in a lab!

    At least some (if not all) of the “surprising” nature of their result is probably due to their neglect of natural selection (“Second, we neglect the effect of natural selection.”). Obviously natural selection is going to ensure that newly escaped variants (e.g. with enhanced infectivity) have a significant number of non-synonymous mutations.

    Anyway, there’s probably something quite interesting going on that underlies the low number of synonymous mutations in Omicron.

    The paper also has a couple of red flags. The “political” stuff about which scientists thought this or that at various points in the progression of the pandemic; also seriously impressive displays of illogic such as:

    “Since intentional point mutation to change an amino acid to another amino acid to gain function is non-synonymous, the strong bias toward N mutation suggests that the spike protein of the Omicron variant is highly likely to have been manipulated in a laboratory…”

    which isn’t a large number of synonymous mutations from:

    “Since intentional use of matches to light a fire can result in substantial fire-related damage, the observation of fire-related damage suggests that someone intentionally used matches to light a fire.”

  6. You can passage a virus through human cells and select for variants with whatever property. If you are really cynical, you don’t even need cell culture technology and can use prisoners or volunteers.

    I don’t think people realize how trivial it is to artificially create a virus that looks consistent with natural selection. Or that viruses escape from BSL-3/4 labs all the time. Like every few years that we know about. SARS-2 (covid) escaped from a lab studying it within two years of being discovered.

    https://www.science.org/content/article/taiwan-s-science-academy-fined-biosafety-lapses-after-lab-worker-contracts-covid-19
