Predicting transcription factor binding sites using local over-representation and comparative genomics

BMC Bioinformatics. 2006 Aug 31;7:396. doi: 10.1186/1471-2105-7-396.

Abstract

Background: Identifying cis-regulatory elements is crucial to understanding gene expression, which highlights the importance of the computational detection of overrepresented transcription factor binding sites (TFBSs) in coexpressed or coregulated genes. However, this is a challenging problem, especially when considering higher eukaryotic organisms.

Results: We have developed a method, named TFM-Explorer, that searches a set of coregulated genes for locally overrepresented TFBSs. The TFBSs are modeled by profiles provided by a database of position weight matrices. The novelty of the method is that it takes advantage of spatial conservation in the sequence and supports multiple species. The efficiency of the underlying algorithm and its robustness to noise allow weak regulatory signals to be detected in large heterogeneous data sets.
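
To make the idea of local over-representation concrete, the following is a minimal sketch and not the TFM-Explorer algorithm itself, whose scoring and statistics are not described in this abstract. It assumes a toy log-odds position weight matrix (`PWM`), promoter sequences aligned on a common reference point such as the transcription start site, a fixed score threshold, and a background per-position hit probability `p_hit`; all of these names and values are hypothetical. It counts matrix hits that fall in one positional window across the sequences and evaluates the count with a binomial tail probability.

```python
import math

# Hypothetical 4-column log-odds matrix favoring "TATA"; not taken from any database.
PWM = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
]


def pwm_score(pwm, site):
    """Sum the per-column scores of a site that has the same length as the matrix."""
    return sum(column[base] for column, base in zip(pwm, site))


def find_hits(pwm, seq, threshold):
    """Return the start positions in seq whose score reaches the threshold."""
    width = len(pwm)
    return [
        i
        for i in range(len(seq) - width + 1)
        if pwm_score(pwm, seq[i:i + width]) >= threshold
    ]


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


def window_enrichment(pwm, seqs, threshold, start, end, p_hit):
    """Count hits starting in [start, end) over all sequences.

    Returns the observed hit count and the binomial tail probability of
    seeing at least that many hits if each tested position were a hit
    independently with probability p_hit (an assumed background model).
    """
    width = len(pwm)
    observed = 0
    trials = 0
    for seq in seqs:
        last = min(end, len(seq) - width + 1)
        for i in range(start, last):
            trials += 1
            if pwm_score(pwm, seq[i:i + width]) >= threshold:
                observed += 1
    return observed, binomial_tail(observed, trials, p_hit)


if __name__ == "__main__":
    promoters = ["GGTATAGG", "CCTATACC", "GCTATAGC"]
    count, p_value = window_enrichment(PWM, promoters, 4.0, 1, 4, 0.01)
    print(count, p_value)
```

A small tail probability for a window suggests that hits cluster at that position more often than the assumed background would predict. A realistic analysis would estimate `p_hit` from background sequences and correct for testing many windows and many matrices.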

Conclusion: TFM-Explorer provides an efficient way to predict TFBS overrepresentation in related sequences. Promising results were obtained in a variety of examples in human, mouse, and rat genomes. The software is publicly available at http://bioinfo.lifl.fr/TFM-Explorer.

MeSH terms

  • Algorithms*
  • Base Sequence
  • Binding Sites
  • Chromosome Mapping / methods*
  • Genomics / methods
  • Molecular Sequence Data
  • Protein Binding
  • Regulatory Elements, Transcriptional / genetics
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*
  • Software*
  • Species Specificity
  • Transcription Factors / genetics*
  • Transcription, Genetic / genetics*

Substances

  • Transcription Factors