Large-scale analysis of SARS-CoV-2 spike-glycoprotein mutants demonstrates the need for continuous screening of virus isolates

PLoS One. 2021 Sep 27;16(9):e0249254. doi: 10.1371/journal.pone.0249254. eCollection 2021.

Abstract

Due to the wide spread of the COVID-19 pandemic, the SARS-CoV-2 genome is evolving in diverse human populations. Several studies have already reported different strains and an increase in the mutation rate. In particular, mutations in the SARS-CoV-2 spike glycoprotein are of great interest because it mediates infection in humans and the recently approved mRNA vaccines are designed to induce immune responses against it. We analyzed 1,036,030 SARS-CoV-2 genome assemblies and 30,806 NGS datasets from GISAID and the European Nucleotide Archive (ENA), focusing on non-synonymous mutations in the spike protein. Only around 2.5% of the samples contained the wild-type spike protein with no variation from the reference. Among the spike protein mutants, we confirmed a low mutation rate, with fewer than 10 non-synonymous mutations in 99.6% of the analyzed sequences, but the mean and median number of spike protein mutations per sample increased over time. In total, 5,472 distinct variants were found. The majority of the observed variants were recurrent, but only 21 and 14 recurrent variants were found in at least 1% of the mutant genome assemblies and NGS samples, respectively. Further, we found high-confidence subclonal variants in about 2.6% of the NGS datasets with mutant spike protein, which might indicate co-infection with various SARS-CoV-2 strains and/or intra-host evolution. Lastly, some variants might affect antibody binding or T-cell recognition. These findings demonstrate the continued importance of monitoring SARS-CoV-2 sequences for early detection of variants that require adaptations in preventive and therapeutic strategies.
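
The summary statistics reported above (wild-type fraction, share of mutant samples with fewer than 10 mutations, number of distinct and recurrent variants, and variants present in at least 1% of mutant samples) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `summarize_spike_variants`, the input layout (a mapping from sample ID to a set of spike amino-acid changes), the toy data, and the definition of "recurrent" as "seen in at least two samples" are all assumptions made for illustration.

```python
from collections import Counter


def summarize_spike_variants(samples, min_fraction=0.01):
    """Summarize per-sample spike variants.

    samples: dict mapping a sample ID to a set of non-synonymous spike
    changes (e.g. {"D614G"}). An empty set means wild-type spike.
    NOTE: hypothetical helper, not taken from the paper.
    """
    total = len(samples)
    # Mutant samples carry at least one non-synonymous spike change.
    mutants = {sid: v for sid, v in samples.items() if v}
    n_mutant = len(mutants)

    # Count in how many mutant samples each variant occurs.
    counts = Counter(v for variants in mutants.values() for v in variants)

    # Assumption: "recurrent" means observed in at least two samples.
    recurrent = {v for v, c in counts.items() if c >= 2}

    # Variants present in at least min_fraction of the mutant samples.
    frequent = sorted(
        v for v, c in counts.items() if n_mutant and c / n_mutant >= min_fraction
    )

    return {
        "wild_type_fraction": (total - n_mutant) / total if total else 0.0,
        "fraction_under_10": (
            sum(len(v) < 10 for v in mutants.values()) / n_mutant
            if n_mutant
            else 0.0
        ),
        "distinct_variants": len(counts),
        "recurrent_variants": len(recurrent),
        "frequent_variants": frequent,
    }


# Toy input for illustration only; not data from the study.
toy = {
    "s1": set(),
    "s2": {"D614G"},
    "s3": {"D614G", "N501Y"},
    "s4": {"D614G", "A222V"},
}
summary = summarize_spike_variants(toy, min_fraction=0.5)
print(summary)
```

With real data, the default `min_fraction=0.01` would correspond to the 1% threshold mentioned in the abstract; the toy call uses 0.5 only because four samples are too few for a 1% cut-off to be informative.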

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Antibodies / immunology
  • COVID-19 / prevention & control
  • COVID-19 / transmission
  • COVID-19 / virology*
  • Genome, Viral*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Mutation Rate
  • Mutation*
  • Pandemics
  • Protein Domains
  • SARS-CoV-2 / chemistry
  • SARS-CoV-2 / genetics*
  • Spike Glycoprotein, Coronavirus / chemistry
  • Spike Glycoprotein, Coronavirus / genetics*
  • T-Lymphocytes / immunology

Substances

  • Antibodies
  • Spike Glycoprotein, Coronavirus
  • spike protein, SARS-CoV-2

Grants and funding

The study was supported by BioNTech SE, Mainz, Germany. The funder provided support in the form of salary for author U.S., but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of this author are articulated in the ‘author contributions’ section. In addition, the other authors are employees of the non-profit company TRON gGmbH and are supported in the form of salary. TRON gGmbH did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.