Regional and temporal coordinated mutation patterns in SARS-CoV-2 spike protein revealed by a clustering and network analysis

Sci Rep. 2022 Jan 21;12(1):1128. doi: 10.1038/s41598-022-04950-4.

Abstract

SARS-CoV-2 has steadily mutated during its spread to more than 300 million people throughout the world. The WHO has designated strains with certain mutations as "variants of concern" (VOC), as they may have higher infectivity and/or resist neutralization by antibodies in sera of vaccinated individuals and convalescent patients. Methods to detect regionally emerging VOC are needed to guide treatment and vaccine design. Cluster and network analysis was applied to over 1.2 million sequences of the SARS-CoV-2 spike protein from 36 countries in the GISAID database. While some mutations rapidly spread throughout the world, regionally specific groups of variants were identified. Strains circulating in each country contained different sets of high-frequency mutations, many of which were known VOCs. Mutations within clusters increased in frequency simultaneously. Low-frequency but highly correlated mutations detected by the method could signal emerging VOCs, especially if they occur at higher frequency in other regions. An automated version of our method to find high-frequency mutations in a set of SARS-CoV-2 spike sequences is available online at http://curie.utmb.edu/SAR.html.
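
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of finding coordinated mutations, not the authors' implementation. It assumes the spike sequences are already aligned to a reference of equal length, uses the phi coefficient as the correlation measure, and groups mutations by connected components of a thresholded correlation network. All function names and the thresholds `min_freq` and `min_phi` are hypothetical choices made for illustration.

```python
from itertools import combinations
from math import sqrt


def call_mutations(reference, sequence):
    """Return substitutions (e.g. 'D3K') of an aligned sequence vs. the reference."""
    return {
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, sequence), start=1)
        if ref != alt
    }


def mutation_frequencies(mutation_sets):
    """Fraction of sequences that carry each mutation."""
    counts = {}
    for muts in mutation_sets:
        for m in muts:
            counts[m] = counts.get(m, 0) + 1
    n = len(mutation_sets)
    return {m: c / n for m, c in counts.items()}


def phi(mutation_sets, a, b):
    """Phi correlation between presence/absence of mutations a and b."""
    n = len(mutation_sets)
    na = sum(a in s for s in mutation_sets)
    nb = sum(b in s for s in mutation_sets)
    nab = sum(a in s and b in s for s in mutation_sets)
    denom = sqrt(na * (n - na) * nb * (n - nb))
    return (n * nab - na * nb) / denom if denom else 0.0


def correlated_clusters(mutation_sets, min_freq=0.05, min_phi=0.8):
    """Group mutations into connected components of the correlation network.

    min_freq and min_phi are illustrative thresholds, not values from the paper.
    """
    freqs = mutation_frequencies(mutation_sets)
    kept = sorted(m for m, f in freqs.items() if f >= min_freq)

    # Link two mutations when their co-occurrence correlation is high enough.
    neighbours = {m: set() for m in kept}
    for a, b in combinations(kept, 2):
        if phi(mutation_sets, a, b) >= min_phi:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Depth-first search to collect connected components as clusters.
    clusters, seen = [], set()
    for m in kept:
        if m in seen:
            continue
        stack, component = [m], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters
```

In such a sketch, running `correlated_clusters` separately on the sequences from each country or time window would allow the resulting mutation groups to be compared across regions and over time.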

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • COVID-19 / genetics*
  • Humans
  • Mutation*
  • SARS-CoV-2 / genetics*
  • Spike Glycoprotein, Coronavirus / genetics*

Substances

  • Spike Glycoprotein, Coronavirus
  • spike protein, SARS-CoV-2