Characterizations of SARS-CoV-2 mutational profile, spike protein stability and viral transmission

Infect Genet Evol. 2020 Nov:85:104445. doi: 10.1016/j.meegid.2020.104445. Epub 2020 Jun 30.

Abstract

The recent pandemic of SARS-CoV-2 infection has affected more than 3.0 million people worldwide, with more than 200 thousand reported deaths. The SARS-CoV-2 genome is capable of acquiring mutations rapidly as the virus spreads. Whole-genome sequencing data offer a wide range of opportunities to study mutation dynamics. The increasing amount of SARS-CoV-2 whole-genome sequence data prompted us to explore the mutation profile across the genome, to assess genome diversity, and to investigate the implications of these mutations for protein stability and viral transmission. We identified frequently mutated residues by aligning ~660 SARS-CoV-2 genomes and validated them in 10,000 datasets available in GISAID Nextstrain. We further evaluated the effect of these frequently mutated residues on the structural stability of the spike glycoprotein and their possible functional consequences in other proteins. Among the 11 genes, surface glycoprotein, nucleocapsid, ORF1ab, and ORF8 showed frequent mutations, while envelope, membrane, ORF6, ORF7a, and ORF7b were conserved in terms of amino acid substitutions. Combined analysis of the frequently mutated residues identified 20 viral variants, among which 12 specific combinations comprised more than 97% of the isolates considered for the analysis. Some of the mutations in different proteins co-occurred, suggesting structural and/or functional interactions among different SARS-CoV-2 proteins and their involvement in adaptability and viral transmission. Analysis of the structural stability of surface glycoprotein mutants indicated that specific variants are viable and are more prone to be temporally and spatially distributed across the globe. A similar empirical analysis of other proteins indicated that several variants have important functional implications. Identification of frequently mutated variants among COVID-19 patients might be useful for better clinical management, contact tracing, and containment of the disease.
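The abstract does not describe the authors' pipeline in detail. As a rough illustration only, the idea of finding frequently mutated residues in aligned sequences and then grouping isolates by their combination of substitutions could be sketched as follows. The function names, the 50% threshold, and the toy sequences are all hypothetical and are not taken from the paper.

```python
from collections import Counter

GAP_OR_UNKNOWN = "-X"  # alignment gap or ambiguous residue (assumed convention)


def mutation_frequencies(reference, aligned):
    """Map each 1-based position to the fraction of sequences that differ
    from the reference there (gaps and unknown residues are ignored)."""
    counts = Counter()
    for seq in aligned:
        for pos, (ref_aa, aa) in enumerate(zip(reference, seq), start=1):
            if aa != ref_aa and aa not in GAP_OR_UNKNOWN:
                counts[pos] += 1
    total = len(aligned)
    return {pos: n / total for pos, n in counts.items()}


def variant_profiles(reference, aligned, hotspots):
    """Count how many sequences carry each combination of substitutions
    at the given hotspot positions (e.g. ('D2G', 'K4R'))."""
    profiles = Counter()
    for seq in aligned:
        profile = tuple(
            f"{reference[p - 1]}{p}{seq[p - 1]}"
            for p in hotspots
            if seq[p - 1] != reference[p - 1] and seq[p - 1] not in GAP_OR_UNKNOWN
        )
        profiles[profile] += 1
    return profiles


# Toy, made-up protein alignment: four isolates against a 4-residue reference.
reference = "MDGK"
aligned = ["MDGK", "MGGK", "MGGR", "MGGR"]

freqs = mutation_frequencies(reference, aligned)
# Hypothetical cut-off: call a position "frequently mutated" at >= 50%.
hotspots = sorted(pos for pos, freq in freqs.items() if freq >= 0.5)
profiles = variant_profiles(reference, aligned, hotspots)

for profile, n in profiles.most_common():
    print(profile or ("reference",), n)
```

In this sketch, a substitution combination shared by many isolates corresponds loosely to what the abstract calls a viral variant, and substitutions that repeatedly appear in the same profile are candidates for co-occurrence. A real analysis would start from a proper multiple sequence alignment of full genomes rather than short toy strings.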

Keywords: Frequent mutation; Hot-spot mutations; Protein stability; SARS-CoV-2; Spike glycoprotein.

MeSH terms

  • Humans
  • Models, Molecular
  • Mutation*
  • Phylogeny
  • Protein Conformation
  • Protein Domains
  • SARS-CoV-2 / genetics*
  • Sequence Alignment
  • Spike Glycoprotein, Coronavirus / chemistry*
  • Spike Glycoprotein, Coronavirus / genetics
  • Whole Genome Sequencing

Substances

  • Spike Glycoprotein, Coronavirus