Abstract
Based on a five-letter model of the 20 amino acids, we propose a new 2-D graphical representation of protein sequences. We then transform this graphical representation into a numerical characterization that facilitates quantitative comparison of protein sequences. As an application, we construct a phylogenetic tree of 56 coronavirus spike proteins. The resulting tree agrees well with the established taxonomic groups.
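The pipeline outlined in the abstract (reduce each sequence to five letters, draw it as a 2-D curve, then condense the curve into numbers that can be compared) might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the actual five-letter grouping, the curve construction, or the numerical descriptor, so the grouping in `GROUPS`, the 72-degree step vectors, and the scaled-centroid descriptor are all hypothetical stand-ins, not the authors' method.

```python
import math

# Hypothetical five-class grouping of the 20 amino acids. This is an
# assumption for illustration; the abstract does not list the actual classes.
GROUPS = {
    "a": "AVLIMC",  # aliphatic / sulfur-containing
    "b": "FWYH",    # aromatic
    "c": "STNQ",    # polar, uncharged
    "d": "KRDE",    # charged
    "e": "GP",      # conformationally special
}
LETTER_OF = {aa: g for g, members in GROUPS.items() for aa in members}

# Assumed geometry: each class letter moves the curve along one of five
# unit vectors spaced 72 degrees apart.
STEP = {
    g: (math.cos(2 * math.pi * k / 5), math.sin(2 * math.pi * k / 5))
    for k, g in enumerate(sorted(GROUPS))
}


def reduce_sequence(seq):
    """Map a protein sequence onto the five-letter alphabet.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    return "".join(LETTER_OF[aa] for aa in seq.upper())


def curve(seq):
    """Return the 2-D walk (list of (x, y) points) traced by the sequence."""
    x = y = 0.0
    points = []
    for letter in reduce_sequence(seq):
        dx, dy = STEP[letter]
        x += dx
        y += dy
        points.append((x, y))
    return points


def descriptor(seq):
    """Condense the curve into two numbers: its centroid divided by length.

    Dividing by the length is an assumed normalization so that sequences of
    different lengths stay comparable. An empty sequence is not supported.
    """
    pts = curve(seq)
    n = len(pts)
    cx = sum(p[0] for p in pts) / n
    cy = sum(p[1] for p in pts) / n
    return (cx / n, cy / n)


def distance(seq_a, seq_b):
    """Euclidean distance between the descriptors of two sequences."""
    ax, ay = descriptor(seq_a)
    bx, by = descriptor(seq_b)
    return math.hypot(ax - bx, ay - by)
```

Pairwise values of `distance` over a set of sequences would form a distance matrix, which a standard clustering method such as UPGMA or neighbor-joining could turn into a tree. Which tree-building method the authors used is not stated in the abstract.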
Publication types
- Research Support, Non-U.S. Gov't
MeSH terms
- Amino Acid Sequence
- Coronavirus / chemistry*
- Coronavirus / genetics*
- Membrane Glycoproteins / chemistry
- Molecular Sequence Data
- Phylogeny*
- Sequence Analysis, Protein / methods*
- Spike Glycoprotein, Coronavirus
- Viral Envelope Proteins / chemistry
- Viral Proteins / chemistry*
Substances
- Membrane Glycoproteins
- Spike Glycoprotein, Coronavirus
- Viral Envelope Proteins
- Viral Proteins