Nomenclature and numbering of the hepatitis C virus

Methods Mol Biol. 2009:510:33-53. doi: 10.1007/978-1-59745-394-3_4.

Abstract

International standardization and coordination of the nomenclature of variants of hepatitis C virus (HCV) is increasingly needed as more is discovered about the scale of HCV-related liver disease and about the important biological and antigenic differences that exist between variants. Consistency in numbering is also increasingly required for functional and clinical studies of HCV. For example, an unambiguous method for referring to amino acid substitutions at specific positions in the NS3 and NS5B coding sequences associated with resistance to specific HCV inhibitors is essential in the investigation of antiviral treatment. Inconsistent and inaccurate numbering of locations in DNA and protein sequences is becoming a problem in the HCV scientific literature.

A group of experts in the field of HCV genetic variability, together with those involved in the development of HCV sequence databases (the Hepatitis Virus Database, Japan; euHCVdb, France; and the Los Alamos National Laboratory, United States), convened to reexamine the status of HCV genotype nomenclature, resolve conflicting genotype or subtype names among described variants of HCV, and draw up revised criteria for the assignment of new genotypes as they are discovered in the future. They also discussed how HCV sequence databases could introduce and facilitate a standardized numbering system for HCV nucleotides, proteins, and epitopes.

A comprehensive listing of all currently classified variants of HCV incorporates a number of agreed genotype and subtype name reassignments to create consistency in nomenclature. A consensus proposal was drawn up for the classification of new variants into genotypes and subtypes, which recognizes and incorporates new knowledge of HCV genetic diversity and epidemiology. The proposed numbering system was adapted from the Los Alamos HIV database, with elements from the hepatitis B virus numbering system. The system covers nucleotide and amino acid sequences as well as epitopes, and uses the full-length genome sequence of isolate H77 (accession number AF009606) as a reference. It includes a method for numbering insertions and deletions relative to this reference sequence.
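
As an illustration of reference-based numbering, the following minimal Python sketch assigns coordinates to a query sequence from a pairwise alignment against a reference. The abstract does not spell out the labeling rules, so the conventions used here are assumptions: inserted residues take the preceding reference position plus a lowercase letter (for example 3a, 3b), and deleted positions keep their reference number with a gap character as the residue. The function name `number_against_reference` and the toy sequences are hypothetical; in practice the reference row would be an alignment against H77 (AF009606).

```python
from string import ascii_lowercase


def number_against_reference(ref_aligned, query_aligned, gap="-"):
    """Label each query residue with a reference-based coordinate.

    Both arguments are rows of the same pairwise alignment, so they
    must have equal length. Returns a list of (label, residue) pairs.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")

    labels = []
    ref_pos = 0    # last reference position consumed (1-based)
    ins_count = 0  # residues inserted since that position

    for ref_char, query_char in zip(ref_aligned, query_aligned):
        if ref_char != gap:
            # Column present in the reference: advance the coordinate.
            # If query_char is a gap, this is a deletion in the query.
            ref_pos += 1
            ins_count = 0
            labels.append((str(ref_pos), query_char))
        elif query_char != gap:
            # Insertion (assumed convention): preceding position + letter.
            if ins_count >= len(ascii_lowercase):
                raise ValueError("insertion longer than 26 residues")
            suffix = ascii_lowercase[ins_count]
            labels.append((f"{ref_pos}{suffix}", query_char))
            ins_count += 1
        # Columns where both rows are gaps carry no information.

    return labels


# Toy alignment (not real HCV sequence): two inserted residues after
# reference position 3 and a deletion at reference position 5.
result = number_against_reference("ACG--TA", "ACGTTT-")
print(result)
```

Because coordinates advance only on reference columns, a substitution keeps the same position number in every isolate, whatever insertions or deletions occur upstream.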

MeSH terms

  • 3' Untranslated Regions / genetics
  • 5' Untranslated Regions / genetics
  • Amino Acid Sequence
  • Base Sequence
  • Databases, Genetic
  • Genome, Viral
  • Genotype
  • Hepacivirus / chemistry
  • Hepacivirus / classification*
  • Hepacivirus / genetics
  • INDEL Mutation
  • Molecular Sequence Data
  • Recombination, Genetic
  • Terminology as Topic*

Substances

  • 3' Untranslated Regions
  • 5' Untranslated Regions