Why the JN.1 Variant Is Over-Represented in Wastewater Covid Surveillance

It’s now about the silent mutations with Pirola descendants.

Milton Simba Kambarami
Microbial Instincts


The illustration depicts the high rate of negative or purifying selection in the SARS-CoV-2 S protein. Sites with values below 0 on the y-axis (beta-alpha) indicate a greater likelihood of negative selection, which is evident in almost every site of the S protein. The article focuses on site 686, which is highlighted in the illustration.

In an article I wrote 4 months ago, I mentioned that Pirola, though not yet a threat, was going to be an evolutionarily successful variant through its descendants. Currently, JN.1 is the most widely spreading COVID-19 variant.

Today, I want to talk about synonymous mutations and their role in sequence analysis. While positive or adaptive selection often receives the most attention in evolutionary biology, it’s important to note that synonymous mutations can also play a significant role in how viruses evolve to adapt to new environments.

Last week, I analyzed JN.1 S proteins to better understand their focal point of evolution. The results were surprising: JN.1 showed higher rates of purifying selection than of adaptive selection. This was the first time I had observed this in SARS-CoV-2 sequence analysis.

I advocate for science explained in simple English. Before I continue, I would like to provide some background information about synonymous mutations. However, if you are only interested in obtaining information and not concerned about the scientific and mathematical details, you can skip ahead to THE GIST section.

Background

How Synonymous Substitutions Affect RNA Tertiary Structure and Function

RNA molecules are essential for life, as they perform various roles in gene expression, regulation, and processing. RNA molecules have complex three-dimensional shapes, called tertiary structures, that are determined by their primary sequences and the interactions between their nucleotides.

RNA tertiary structures are important for the function of many RNA molecules, such as ribosomal RNA, transfer RNA, messenger RNA, and non-coding RNA. RNA tertiary structures can also influence the translation of mRNA into protein, by affecting the speed and accuracy of the ribosome.

However, RNA sequences are not static, but can change over time due to mutations. Mutations are changes in the nucleotide sequence that can occur randomly or due to environmental factors, such as radiation, chemicals, or viruses. Mutations can have different effects on the protein and RNA molecules, depending on the type and location of the mutation.

One type of mutation is synonymous substitution, which is a change in the nucleotide sequence that does not alter the amino acid sequence of the encoded protein.

For example, the codon UUU can be changed to UUC, but both codons encode the same amino acid, phenylalanine. Synonymous substitutions are also called silent mutations, as they are thought to be neutral or have no effect on the protein function.

However, recent studies have shown that synonymous substitutions can still affect the folding, stability, function, and interactions of the RNA molecule, as well as its translation efficiency and accuracy.

This is because synonymous substitutions can alter the codon usage and translation efficiency of the mRNA, the stability and interactions of the mRNA, and the conformational ensemble and folding pathway of the mRNA.

These effects can have implications for the fitness and phenotype of the organism, as well as the evolution and divergence of the RNA and protein molecules.

Codon Usage and Translation Efficiency

Codons are the units of genetic code that specify the amino acids in the protein sequence. There are 64 possible codons, but only 20 amino acids, so some amino acids are encoded by more than one codon.

For example, the amino acid leucine is encoded by six codons: UUA, UUG, CUU, CUC, CUA, and CUG. These codons are called synonymous codons, as they encode the same amino acid.
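To make the idea of synonymous codons concrete, here is a minimal Python sketch that builds the standard genetic code and looks up which codons share an amino acid. The names CODON_TABLE, synonymous_codons, and is_synonymous are my own illustrative choices, not part of any library, and the table is the standard nuclear code rather than anything specific to a virus.

```python
from itertools import product

# Standard genetic code, written in the conventional U, C, A, G ordering.
# "*" marks the three stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons (e.g. "UUU") to a one-letter amino acid code.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def is_synonymous(codon_before, codon_after):
    """True if swapping one codon for the other keeps the same amino acid."""
    return CODON_TABLE[codon_before] == CODON_TABLE[codon_after]


leucine_codons = synonymous_codons("L")
print(leucine_codons)                 # the six leucine codons
print(is_synonymous("UUU", "UUC"))    # phenylalanine in both cases
print(is_synonymous("UUU", "UUA"))    # phenylalanine versus leucine
```

The second call is the UUU to UUC example from earlier in this section: the nucleotide changes, the amino acid does not.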

However, not all synonymous codons are used equally in different organisms, genes, or regions of mRNA. Some codons are more preferred or optimal than others, meaning that they are recognized faster and more accurately by the tRNA.

The preference or usage of different codons is called codon bias, and it can vary depending on the organism, gene, or region of mRNA.
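Codon bias can be illustrated by simply counting how often each synonymous codon appears in a coding sequence. The sketch below is a simplified illustration: the sequence is made up for the example, the function names are hypothetical, and the sequence is assumed to be read in frame from its first nucleotide.

```python
from collections import Counter


def codon_counts(mrna):
    """Count codons in an mRNA sequence read in frame from position 0."""
    usable = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    return Counter(mrna[i:i + 3] for i in range(0, usable, 3))


def relative_usage(mrna, synonymous_group):
    """Fraction of each codon within one group of synonymous codons."""
    counts = codon_counts(mrna)
    total = sum(counts[c] for c in synonymous_group)
    if total == 0:
        return {c: 0.0 for c in synonymous_group}
    return {c: counts[c] / total for c in synonymous_group}


# The six leucine codons of the standard genetic code.
LEUCINE = ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"]

# Made-up toy sequence (not real viral data): CUG CUG CUG UUA AUG CUG
toy_mrna = "CUGCUGCUGUUAAUGCUG"
usage = relative_usage(toy_mrna, LEUCINE)
print(usage)
```

In this toy sequence, CUG accounts for four of the five leucine codons, so it would be the "preferred" leucine codon here. Real codon bias measures are computed over whole genomes or gene sets, not six codons.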

Synonymous substitutions can change the codon usage or bias of a region of mRNA, which can affect protein synthesis and folding. For example, some synonymous substitutions can introduce or remove optimal or rare codons, which can affect the rate and accuracy of translation, as well as the co-translational folding of the protein.

Co-translational folding is the process by which the protein begins to fold as it is being synthesized by the ribosome. The speed and accuracy of translation can affect the co-translational folding, as different codons can cause pauses, errors, or misfolding in the protein synthesis and folding.

Stability and Interactions

RNA molecules are not linear, but can fold into complex shapes by forming base pairs and stacking interactions between their nucleotides. Base pairs form through hydrogen bonds between complementary nucleotides, such as A and U, or G and C. Stacking interactions are the attractive forces that occur between adjacent base pairs, which stabilize the RNA structure.

The base pairing and stacking interactions determine the secondary and tertiary structure of the RNA molecule, which can affect its stability and degradation, as well as its interactions with other molecules, such as ribosomes, proteins, and non-coding RNAs.

Synonymous substitutions can affect the secondary and tertiary structure of the mRNA, by changing the base pairing and stacking interactions between the nucleotides. For example, some synonymous substitutions can increase the G:C content of the mRNA, which can increase the number of potential G:C base pairs, which are stronger than A:U interactions. This can increase the stability of the mRNA, but also affect its folding and accessibility.
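The G:C point can be shown with a small calculation. The sketch below compares the G+C fraction of two made-up sequences that encode the same three amino acids (phenylalanine, leucine, glycine) but differ by two synonymous substitutions. The sequences and the function name are illustrative assumptions, and G+C fraction is only a crude proxy: it does not predict the actual fold of an RNA.

```python
def gc_content(rna):
    """Fraction of nucleotides in the sequence that are G or C."""
    rna = rna.upper()
    if not rna:
        return 0.0
    return (rna.count("G") + rna.count("C")) / len(rna)


# Both toy sequences encode Phe-Leu-Gly under the standard genetic code.
before = "UUUCUUGGA"  # UUU (Phe) CUU (Leu) GGA (Gly)
after = "UUCCUCGGA"   # UUC (Phe) CUC (Leu) GGA (Gly)

print(gc_content(before))
print(gc_content(after))
```

The protein is unchanged, yet the G+C fraction rises from 3 of 9 nucleotides to 5 of 9, which is the kind of shift that can change how strongly a stretch of mRNA pairs with itself.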

Conformational Ensemble and Folding Pathway

RNA molecules are not rigid, but can adopt different shapes or conformations depending on the environmental conditions, such as temperature, pH, or salt concentration.

The set of possible shapes that the RNA can adopt is called the conformational ensemble, which is determined by the energy landscape and the transition states of the folding process.

The energy landscape is the representation of the potential energy of the RNA molecule as a function of its conformation. The transition states are the intermediate conformations that the RNA molecule passes through as it folds from one shape to another.

The conformational ensemble and the folding pathway can affect the dynamics and kinetics of the RNA folding, as well as the availability of functional conformations.

Synonymous substitutions can alter the distribution of possible shapes that the mRNA can adopt, by changing the energy landscape and the transition states of the folding process. For example, some synonymous substitutions can introduce or remove structural elements, such as hairpins, loops, or pseudoknots, which can affect the folding pathway and the interactions with other factors.

Back to the Corona

The (beta-alpha) value (see the graph at the top of this article) is a measure of positive selection, also known as adaptive selection, versus purifying selection, also known as negative selection, in the evolution of coding sequences.

Positive selection occurs when a mutation increases the fitness of an organism and is favored by natural selection, whereas purifying selection occurs when a mutation decreases the fitness of an organism and is eliminated by natural selection.

The (beta-alpha) value is calculated by comparing the rates of non-synonymous substitutions (dN) and synonymous substitutions (dS) between two or more sequences. Non-synonymous substitutions refer to changes in the DNA that alter the amino acid sequence of the encoded protein, while synonymous substitutions refer to changes in the DNA that do not alter the amino acid sequence of the encoded protein.

The (beta-alpha) value is defined as:

(beta-alpha) = (dN/dS) - 1

The (beta-alpha) value is a measure that can be used to infer the evolutionary history and functional importance of genes and proteins, as well as the selective pressures acting on them.

A positive value indicates that positive selection is stronger than purifying selection, meaning that there are more non-synonymous substitutions than expected by chance.

On the other hand, a negative value indicates that purifying selection is stronger than positive selection, meaning that there are fewer non-synonymous substitutions than expected by chance. A zero value indicates that positive and purifying selection are balanced, meaning that the rate of non-synonymous substitutions is equal to the rate of synonymous substitutions.
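The sign rule above can be written out as a few lines of arithmetic. This is a minimal sketch that uses the formula exactly as defined in this article; the dN and dS numbers are hypothetical, the function names are my own, and real analyses estimate these rates per site with statistical models and significance tests rather than plugging in two numbers.

```python
def beta_minus_alpha(dn, ds):
    """(beta-alpha) as defined in this article: (dN/dS) - 1."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the dN/dS ratio")
    return dn / ds - 1


def interpret(value, tolerance=1e-9):
    """Translate the sign of (beta-alpha) into the article's categories."""
    if value > tolerance:
        return "positive (adaptive) selection"
    if value < -tolerance:
        return "purifying (negative) selection"
    return "balanced"


# Hypothetical rates for three sites, chosen only to show each case.
examples = {
    "site A": (0.2, 0.8),
    "site B": (1.5, 0.5),
    "site C": (0.4, 0.4),
}

for site, (dn, ds) in examples.items():
    value = beta_minus_alpha(dn, ds)
    print(site, round(value, 3), interpret(value))
```

Site A comes out negative (purifying selection), site B positive (adaptive selection), and site C at zero (balanced), matching the three cases described above.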

There are different methods to estimate the (beta-alpha) value from DNA data, such as the McDonald-Kreitman test, the site model, the branch model, and the branch-site model. These methods use different assumptions and statistical tests to compare the observed and expected numbers of non-synonymous and synonymous substitutions in different regions of the genome, different branches of the phylogenetic tree, or different sites of the protein.

Purifying selection, also known as negative selection, is a process of natural selection that removes harmful genetic variants from a population.

This process can help maintain the stability and functionality of genes and proteins that are essential for the survival and reproduction of organisms. Purifying selection can be detected by comparing the rates of changes in DNA sequences that affect the amino acid sequence of proteins (non-synonymous changes) and those that do not (synonymous changes).

If purifying selection is strong, there will be fewer non-synonymous changes than expected by chance, indicating that most of them are deleterious and eliminated by natural selection.

Additionally, purifying selection can reduce the genetic diversity in regions of the genome that are linked to deleterious variants, because linked neutral variants are removed along with them. This effect is called background selection.

The SARS-CoV-2 S protein site 686 is a part of the extended loop that harbors the S1/S2 cleavage site of the spike protein. The S1/S2 cleavage site is a determinant of SARS-CoV-2 cell tropism and pathogenicity, as it allows the separation of the S1 subunit, which contains the receptor-binding domain (RBD), from the S2 subunit, which mediates membrane fusion.

The S1/S2 cleavage site is recognized and cleaved by host proteases, such as furin and TMPRSS2, which activate the spike protein for entry into host cells. The SARS-CoV-2 S protein site 686 is a serine residue (S) that can be mutated to a glycine residue (G) in some viral variants.

According to a recent study, the S protein 686 mutation decreases the S protein cleavage at the S1/S2 site, but does not affect the ACE2 binding and cell-cell fusion. However, the S protein 686 mutation modulates the efficiency of host cell entry in a cell-type-dependent manner, which may be related to the availability of cathepsin L for S protein activation. Therefore, the SARS-CoV-2 S protein site 686 may play a role in the viral infectivity and adaptation to different host cells.

The GIST

SARS-CoV-2 has been evolving to enter more than just respiratory cells. My analysis points to the synonymous mutation at site 686, which appears to give an added advantage to the Pirola descendant, the JN.1 variant.

How, you ask? S protein site 686 influences host cell tropism, which means it can facilitate the invasion of enteric (digestive) cells in addition to respiratory cells.

Up to now, it has been all science, and it may not have been clear how it connects to the real world, right? Well, there has been a rise in JN.1 detection in wastewater. Why?

The recent increase in JN.1 infections may be attributed to its enhanced ability to infect a wider range of host cells, such as those in the gastrointestinal tract.

The term ‘enhanced’ is used to indicate that JN.1 is not unique in targeting gastrointestinal cells, since all SARS-CoV-2 variants can bind to ACE2 receptors that are expressed on various cell types, including those in the respiratory and digestive systems.

However, prior variants were more likely to cause severe respiratory symptoms, whereas JN.1 may have a different clinical presentation. It’s not that SARS-CoV-2 now causes only diarrhea, but that it has more ways of shedding itself besides the respiratory pathway.

Viruses act like humans. If I were to put Homo sapiens on the Tree of Life, I would put us together with viruses; we are more similar than we think.
