Background: Protozoan parasites improve their likelihood of invading or adapting to the host through their capacity to present a large repertoire of surface molecules. Understanding the mechanisms that generate antigenic diversity is crucial for the development of therapies and the study of evolution. Despite advances driven by molecular biology and genomics, a deeper understanding is still needed of the key properties that may facilitate variation generation, of models that explain the role of genomic re-arrangements, and of how surface protein families can be characterised on the basis of their capacity to generate variation. Computer models may be implemented to explore, visualise and estimate the variation generation capacity of gene families in a dynamic fashion. In this paper we report the dynamic simulation of genomic variation using real Trypanosoma cruzi (T. cruzi) coding sequences as inputs to a computational simulation system. The effects of random, multiple-point mutations and gene conversions on genomic variation generation were quantitatively estimated and visualised. Simulations were also implemented to investigate the potential role of pseudogenes as a source of antigenic variation in T. cruzi.
Results: Computational models of variation generation were applied to real coding sequences from surface proteins in T. cruzi: trans-sialidase-like proteins and the putative surface protein dispersed gene family-1 (DGF1). In the simulations the sequences self-replicated, mutated and re-arranged over thousands of generations. Simulations were run at different mutation rates to estimate the relative robustness of the protein families in the face of multiple-point DNA mutations and sequence re-arrangements. The gene super-families and families showed distinguishing evolutionary responses, and these differential variation responses may be used to characterise the sequence groups on the basis of their capacity to generate variability. The simulations showed that sequences from T. cruzi nuclear genes tend to be relatively more robust against random, multiple-point mutations than those obtained from surface protein genes. Simulations also showed that a gene conversion model may act as an effective variation generation mechanism. For example, unlike other families, sequences from the DGF1 family have the capacity to maximise variation at the amino acid level under relatively low mutation rates and through gene conversion. However, relative to the other protein families, they behave more robustly in response to more severe modifications through intra-family genomic sequence exchange. Independent simulations indicate that DGF1 pseudogenes might play a role in generating greater genomic variation in the DGF1 gene family through gene conversion under different experimental conditions.
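The kind of simulation described above can be illustrated with a minimal Python sketch. It is not the simulation system used in this study: the function names (`mutate`, `gene_conversion`, `divergence`, `simulate`), the parameters (`mutation_rate`, `conversion_prob`, `tract_len`) and the divergence measure are all assumptions made for illustration, and variation is measured only at the nucleotide level. Each generation, every sequence is copied with random point mutations and, with some probability, receives a tract copied from another member of the same family.

```python
import random

BASES = "ACGT"

def mutate(seq, rate, rng):
    """Substitute each base independently with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            # Always pick a base different from the current one.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def gene_conversion(recipient, donor, tract_len, rng):
    """Copy a tract of up to `tract_len` bases from `donor` into the
    same position of `recipient` (hypothetical, simplified model)."""
    n = min(len(recipient), len(donor))
    if n == 0:
        return recipient
    tract_len = min(tract_len, n)
    start = rng.randrange(n - tract_len + 1)
    end = start + tract_len
    return recipient[:start] + donor[start:end] + recipient[end:]

def divergence(a, b):
    """Fraction of differing positions over the shared length."""
    n = min(len(a), len(b))
    return sum(x != y for x, y in zip(a, b)) / n if n else 0.0

def simulate(family, generations, mutation_rate, conversion_prob,
             tract_len, seed=0):
    """Evolve a gene family and record, per generation, the mean
    divergence of each sequence from its original form."""
    rng = random.Random(seed)
    originals = list(family)
    current = list(family)
    history = []
    for _ in range(generations):
        next_gen = []
        for i, seq in enumerate(current):
            child = mutate(seq, mutation_rate, rng)
            if len(current) > 1 and rng.random() < conversion_prob:
                # Donor is any other member of the same family.
                donor = rng.choice(current[:i] + current[i + 1:])
                child = gene_conversion(child, donor, tract_len, rng)
            next_gen.append(child)
        current = next_gen
        mean_div = sum(divergence(o, c)
                       for o, c in zip(originals, current)) / len(current)
        history.append(mean_div)
    return current, history

# Toy input, not real T. cruzi data.
toy_family = ["ATGGCCAAGTTG", "ATGGCTAAGCTG", "ATGACCAAGTTA"]
final, trace = simulate(toy_family, generations=100,
                        mutation_rate=0.001, conversion_prob=0.05,
                        tract_len=4)
```

Running `simulate` over a range of `mutation_rate` values and comparing the resulting `trace` curves between families is one simple way to contrast their responses. Measuring variation at the amino acid level would additionally require translating each sequence before comparison, which this sketch omits.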
Conclusion: Digital, dynamic simulations may be implemented to characterise gene families on the basis of their capacity to generate variation in the face of genomic perturbations. Such simulations may be useful for exploring antigenic variation mechanisms and hypotheses about robustness at the genomic level. This investigation illustrated how sequences derived from surface protein genes can be combined with computer simulations to investigate variation generation mechanisms. Such in silico experiments on self-replicating sequences undergoing random mutations and genomic re-arrangements can offer insights into the diversity generation potential of the genes under study. Biologically-inspired simulations may support the study of genomic variation mechanisms in pathogens whose genomes have been recently sequenced.