Human immunodeficiency virus type 1 subtype C molecular phylogeny: consensus sequence for an AIDS vaccine design?

J Virol. 2002 Jun;76(11):5435-51. doi: 10.1128/jvi.76.11.5435-5451.2002.

Abstract

An evolving dominance of human immunodeficiency virus type 1 subtype C (HIV-1C) in the AIDS epidemic has been associated with a high prevalence of HIV-1C infection in the southern African countries and with an expanding epidemic in India and China. Understanding the molecular phylogeny and genetic diversity of HIV-1C viruses may be important for the design and evaluation of an HIV vaccine for ultimate use in the developing world. In this study we analyzed the phylogenetic relationships (i) among 73 non-recombinant HIV-1C near-full-length genome sequences, including 51 isolates from Botswana; (ii) among HIV-1C consensus sequences that represent different geographic subsets; and (iii) between specific isolates and consensus sequences. Based on the phylogenetic analyses of the 73 near-full-length genomes, 16 "lineages" (a term that is used hereafter for discussion purposes and does not imply taxonomic standing) were identified within HIV-1C. The lineages were supported by high bootstrap values in maximum-parsimony and neighbor-joining analyses and were confirmed by the maximum-likelihood method. The nucleotide diversity among the 73 HIV-1C isolates (mean, 8.93%; range, 2.9 to 11.7%) was significantly higher than the distance from the samples to the consensus sequence (mean, 4.86%; range, 3.3 to 7.2%; P < 0.0001). For all HIV-1C proteins, the translated amino acid distances to the consensus sequence were significantly lower than the distances between samples. The consensus sequences of HIV-1C proteins, accompanied by amino acid frequencies, are presented (that of Gag is presented in this work; those of Pol, Vif, Vpr, Tat, Rev, Vpu, Env, and Nef are presented elsewhere [http://www.aids.harvard.edu/lab_research/concensus_sequence.htm]). Additionally, in the promoter region three NF-kappa B sites (GGGRNNYYCC) were identified within the consensus sequences of the entire set or any subset of HIV-1C isolates. This study suggests that the consensus sequence approach could overcome the high genetic diversity of HIV-1C and facilitate AIDS vaccine design, particularly if efficacy trials confirm the assumption that an HIV-1C antigen with a more extensive match to the circulating viruses is likely to be more efficacious.
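The two computational ideas in the abstract (samples lying closer to a consensus than to one another, and locating NF-kappa B sites from the degenerate motif GGGRNNYYCC) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and not the authors' method: the consensus is a simple per-column majority rule, the distance is an uncorrected p-distance (the abstract does not state which distance model was used), the function names are hypothetical, and the toy sequences are invented rather than taken from the study.

```python
import re
from collections import Counter
from itertools import combinations


def consensus(aligned):
    """Majority-rule consensus of equal-length aligned sequences.

    Ties go to the residue seen first in the column (Counter ordering).
    """
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def p_distance(a, b):
    """Uncorrected proportion of differing sites, ignoring gapped columns."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_pairwise_distance(aligned):
    """Mean p-distance over all pairs of samples."""
    dists = [p_distance(a, b) for a, b in combinations(aligned, 2)]
    return sum(dists) / len(dists)


def mean_distance_to_consensus(aligned):
    """Mean p-distance from each sample to the majority-rule consensus."""
    cons = consensus(aligned)
    return sum(p_distance(s, cons) for s in aligned) / len(aligned)


# IUPAC nucleotide codes needed for the motif quoted in the abstract.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def find_motif_sites(seq, motif="GGGRNNYYCC"):
    """Return 0-based start positions of every (possibly overlapping) match."""
    body = "".join(IUPAC[ch] for ch in motif)
    # A lookahead lets finditer report overlapping occurrences.
    pattern = re.compile("(?=(" + body + "))")
    return [m.start() for m in pattern.finditer(seq.upper())]


# Invented toy alignment; NOT data from the study.
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAA",
    "ACGAACGTAC",
    "TCGTACGTAC",
]

if __name__ == "__main__":
    print("consensus:", consensus(toy_alignment))
    print("mean pairwise distance:", mean_pairwise_distance(toy_alignment))
    print("mean distance to consensus:", mean_distance_to_consensus(toy_alignment))
    print("motif sites:", find_motif_sites("TTGGGACTTTCCAA"))
```

In this toy alignment each sequence differs from the others at its own private site, so the mean distance to the consensus is half the mean pairwise distance. That is the same qualitative pattern the abstract reports for the 73 genomes, although real analyses would typically use a substitution model and a curated alignment.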

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • AIDS Vaccines / genetics*
  • Acquired Immunodeficiency Syndrome / immunology
  • Amino Acid Sequence
  • Base Sequence
  • Consensus Sequence*
  • DNA, Viral
  • Drug Design
  • Gene Products, gag / genetics
  • HIV Infections / epidemiology
  • HIV Infections / virology*
  • HIV Long Terminal Repeat
  • HIV-1 / classification
  • HIV-1 / genetics*
  • HIV-1 / isolation & purification
  • Humans
  • Molecular Sequence Data
  • Phylogeny

Substances

  • AIDS Vaccines
  • DNA, Viral
  • Gene Products, gag

Associated data

  • GENBANK/AF443074
  • GENBANK/AF443115