The diversity of HIV-1 and its propensity to generate escape mutants present fundamental challenges to control efforts, including HIV vaccine design. Intra-host diversification of HIV is determined by the immune responses elicited in an HIV-infected individual over the course of the infection. Complex and dynamic patterns of HIV transmission lead to even more complex viral diversity at the population level over time, presenting enormous challenges to vaccine development. To address inter-patient viral evolution over time, a set of 653 unique HIV-1 subtype C gag sequences was retrieved from the LANL HIV Database, grouped by sampling year as <2000, 2000, 2001-2002, 2003, and 2004-2006, and analyzed for the site-specific frequency of translated amino acid residues. Phylogenetic analysis revealed that 289 of the 653 analyzed sequences (44.3%) fell within 16 clusters defined by an aLRT of more than 0.90. Median (IQR) inter-sample diversity of the analyzed gag sequences was 8.7% (7.7%; 9.8%). Despite the heterogeneous origins of the analyzed sequences, the gamut and frequency of amino acid residues in wild-type Gag were remarkably stable over the last decade of the HIV-1 subtype C epidemic. The vast majority of amino acid residues showed only minor frequency fluctuation over time, consistent with the conserved nature of the HIV-1 Gag protein. Only 4.0% of amino acid residues across Gag (20 out of 500; HXB2 numbering) displayed both a statistically significant change in amino acid frequency over time (p<0.05 by both a trend test and a heterogeneity test) and a range of at least 10% in the frequency of the major amino acid. A total of 59.2% of the amino acid residues whose frequency changed by 10% or more were found within previously identified CTL epitopes. The time of the most recent common ancestor of HIV-1 subtype C was dated to around 1950 (95% HPD from 1928 to 1962).
This study provides evidence for the overall stability of HIV-1 subtype C Gag among viruses circulating in the epidemic over the last decade. However, selected sites across HIV-1C Gag with changing amino acid frequency are likely to be under selection pressure at the population level.
Keywords: CTL epitopes; Gag; HIV-1 subtype C; amino acid frequency; consensus sequence; gag phylogeny; time of MRCA.
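The site-level screen described above (frequency of the major amino acid per sampling period, a range of at least 10%, and a significant trend over time) can be illustrated with a minimal sketch. It rests on several assumptions not stated in the abstract: sequences are already translated and aligned, sampling periods are supplied in chronological order, and the trend test is taken to be a Cochran-Armitage test with equally spaced scores and a normal approximation. The abstract does not name the tests actually used, and the heterogeneity test is omitted here. All function names and the toy peptides are hypothetical.

```python
from collections import Counter
from math import erfc, sqrt


def residue_counts(seqs_by_period, site):
    """Count residues at one 0-based alignment column, per sampling period."""
    return {period: Counter(seq[site] for seq in seqs)
            for period, seqs in seqs_by_period.items()}


def major_residue_range(seqs_by_period, site):
    """Return the pooled major residue, its per-period frequency, and the range."""
    counts = residue_counts(seqs_by_period, site)
    pooled = Counter()
    for c in counts.values():
        pooled.update(c)
    major = pooled.most_common(1)[0][0]
    # Assumes every period contains at least one sequence.
    freqs = [c[major] / sum(c.values()) for c in counts.values()]
    return major, freqs, max(freqs) - min(freqs)


def trend_test(successes, totals):
    """Cochran-Armitage trend test (assumed choice), scores 0, 1, 2, ...

    Returns (z, two-sided p) using the normal approximation.
    """
    scores = range(len(totals))
    n = sum(totals)
    p = sum(successes) / n
    t = sum(s * (r - m * p) for s, r, m in zip(scores, successes, totals))
    var = p * (1 - p) * (
        sum(m * s * s for s, m in zip(scores, totals))
        - sum(m * s for s, m in zip(scores, totals)) ** 2 / n
    )
    if var <= 0:  # invariant site: no trend can be estimated
        return 0.0, 1.0
    z = t / sqrt(var)
    return z, erfc(abs(z) / sqrt(2))


def site_changes(seqs_by_period, site, min_range=0.10, alpha=0.05):
    """Flag a site whose major residue shifts by >= min_range with p < alpha."""
    counts = residue_counts(seqs_by_period, site)
    major, _, spread = major_residue_range(seqs_by_period, site)
    successes = [c[major] for c in counts.values()]
    totals = [sum(c.values()) for c in counts.values()]
    _, p_value = trend_test(successes, totals)
    return spread >= min_range and p_value < alpha


# Toy aligned peptides (hypothetical data), periods in chronological order.
toy = {
    "<2000": ["MGA", "MGA", "MGA", "MGA"],
    "2000": ["MGA", "MGA", "MGA", "MGS"],
    "2001-2002": ["MGA", "MGA", "MGS", "MGS"],
}
major, freqs, spread = major_residue_range(toy, 2)
print(major, freqs, spread, site_changes(toy, 2))
```

With only four sequences per period, the toy site shows a large frequency range but does not reach significance, which illustrates why the two criteria are applied jointly. A real analysis would more likely rely on an established statistics package than on this hand-rolled approximation.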