A comparative genomic analysis of diverse clonal types of enterotoxigenic Escherichia coli reveals pathovar-specific conservation

Infect Immun. 2011 Feb;79(2):950-60. doi: 10.1128/IAI.00932-10. Epub 2010 Nov 15.

Abstract

Enterotoxigenic Escherichia coli (ETEC) is a major cause of diarrheal illness in children less than 5 years of age in low- and middle-income nations, whereas it is an emerging enteric pathogen in industrialized nations. Although ETEC is an important cause of diarrhea, little is known about its genomic composition. To address this, we sequenced the genomes of five ETEC isolates obtained from children with diarrhea in Guinea-Bissau. These five isolates represent distinct and globally dominant ETEC clonal groups. Comparative genomic analyses utilizing a gene-independent whole-genome alignment method demonstrated that sequenced ETEC strains share approximately 2.7 million bases of genomic sequence. Phylogenetic analysis of this "core genome" confirmed the diverse history of the ETEC pathovar and provided a finer resolution of the E. coli relationships than multilocus sequence typing. No identified genomic regions were conserved exclusively in all ETEC genomes; however, we identified more genomic content conserved among ETEC genomes than among non-ETEC E. coli genomes, suggesting that ETEC isolates share a genomic core. Comparisons of known virulence genes and of surface-exposed and colonization factor genes across all sequenced ETEC genomes not only identified variability but also indicated that some antigens are restricted to the ETEC pathovar. Overall, the generation of these five genome sequences, in addition to the two previously generated ETEC genomes, highlights the genomic diversity of ETEC. These studies increase our understanding of ETEC evolution and provide insight into virulence factors and conserved proteins, which may be targets for vaccine development.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Child
  • Conserved Sequence
  • Enterotoxigenic Escherichia coli / classification*
  • Enterotoxigenic Escherichia coli / genetics*
  • Escherichia coli Infections / epidemiology
  • Escherichia coli Infections / microbiology
  • Escherichia coli Proteins / genetics
  • Escherichia coli Proteins / metabolism
  • Gene Expression Regulation, Bacterial / physiology
  • Genetic Variation
  • Genome, Bacterial*
  • Genomics / methods*
  • Guinea-Bissau / epidemiology
  • Humans
  • Membrane Glycoproteins / genetics
  • Membrane Glycoproteins / metabolism
  • Molecular Sequence Data
  • Multilocus Sequence Typing
  • Phylogeny

Substances

  • Escherichia coli Proteins
  • EtpA protein, E coli
  • Membrane Glycoproteins

Associated data

  • GENBANK/AELA00000000
  • GENBANK/AELB00000000
  • GENBANK/AELC00000000
  • GENBANK/AELD00000000
  • GENBANK/AELE00000000