Millions of coronavirus mutations offer a new insight into evolution
Using publicly available SARS-CoV-2 sequences, researchers have revealed which genetic sites must be in a particular state for the coronavirus to survive and which can tolerate changes
7 February 2023
The genome sequences of millions of individual SARS-CoV-2 viruses have enabled researchers to study evolution in a way that wasn’t possible before. The coronavirus’s global proliferation means that we have enough sequence data to track every possible mutation that affects a single letter of its RNA, and the impact each one has on the pathogen’s fitness.
The findings could help us develop drugs that target parts of the virus’s proteins that can’t easily mutate to evade treatment, says Jesse Bloom at the Fred Hutchinson Cancer Center in Seattle, Washington.
By monitoring the growth of coronavirus variants, it is possible to identify some of the single-letter mutations that confer an advantage for the virus. Each letter stands for one of the four bases (A, C, G and U) that make up the virus’s RNA genome. But the mutations identified this way are just a tiny fraction of all possible mutations.
What Bloom and Richard Neher at the University of Basel, Switzerland, realised is that because SARS-CoV-2 has proliferated so greatly in the ongoing global pandemic, every possible single-letter RNA mutation has happened 15,000 times on average. What’s more, the millions of sequenced samples give us a way to assess the results of these natural experiments.
Using millions of publicly available SARS-CoV-2 sequences, the pair first counted how often mutations had occurred in sites where all single-letter mutations are known to be neutral, because they don’t result in any change in protein sequence.
This told them how many mutations would occur in any site without affecting viral fitness.
They then compared the number of observed mutations per site to this expected number. If the observed number is lower than expected, viruses carrying a mutation at that site are more likely to die out, which implies that such mutations are harmful. If it is higher, the mutations are likely to be beneficial.
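The comparison of observed and expected counts can be sketched in a few lines of Python. This is a simplified illustration, not the researchers' actual pipeline: the function names, the toy counts and the pseudocount of 0.5 are all assumptions made here, and the expected count is taken as a plain average over presumed-neutral sites.

```python
import math
from statistics import mean


def expected_count(neutral_counts):
    """Average number of observed mutations at presumed-neutral sites (hypothetical helper)."""
    return mean(neutral_counts)


def fitness_effect(observed, expected, pseudocount=0.5):
    """Log ratio of observed to expected counts.

    Negative values suggest a harmful mutation, positive values a beneficial one.
    The pseudocount (an assumed value) avoids taking the log of zero.
    """
    return math.log((observed + pseudocount) / (expected + pseudocount))


# Toy numbers, invented purely for illustration.
neutral_counts = [90, 110, 100, 95, 105]
expected = expected_count(neutral_counts)

observed_by_site = {"site_A": 3, "site_B": 98, "site_C": 400}
effects = {site: fitness_effect(n, expected) for site, n in observed_by_site.items()}

for site, effect in effects.items():
    print(f"{site}: {effect:+.2f}")
```

In this toy example, the site seen far less often than expected gets a strongly negative score, the site seen about as often as expected scores close to zero, and the site seen far more often gets a positive score.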
Bloom and Neher then mapped the results onto all the SARS-CoV-2 proteins to reveal which sites must be in a particular state for the virus to be successful and which sites can tolerate changes.
The same approach could be applied to any organism that exists in sufficient numbers for every single-letter mutation to have occurred multiple times, and for which we have enough sequence data.
“A species such as tigers or elephants doesn’t have enough living individuals,” says Bloom.
But in 2015, Jay Shendure at the University of Washington in Seattle pointed out that there are enough people that we could observe every mutation that doesn’t affect our survival if we sequenced a large proportion of the human population, which now stands at 8 billion.
“I think it’s less a question of ‘if’ than ‘when’,” says Shendure. We haven’t sequenced nearly enough human genomes yet, he says. “But if current trends continue, then it will happen eventually.”
Reference: bioRxiv, DOI: 10.1101/2023.01.30.526314