Integrative analysis of genomic, functional and protein interaction data predicts long-range enhancer-target gene interactions

Nucleic Acids Res. 2011 Apr;39(7):2492-502. doi: 10.1093/nar/gkq1081. Epub 2010 Nov 24.

Abstract

Multicellular organismal development is controlled by a complex network of transcription factors, promoters and enhancers. Although reliable computational and experimental methods exist for enhancer detection, prediction of their target genes remains a major challenge. On the basis of available literature and ChIP-seq and ChIP-chip data for enhanceosome factor p300 and the transcriptional regulator Gli3, we found that genomic proximity and conserved synteny predict target genes with a relatively low recall of 12-27% within 2 Mb intervals centered at the enhancers. Here, we show that functional similarities between enhancer binding proteins and their transcriptional targets and proximity in the protein-protein interactome improve prediction of target genes. We used all four features to train random forest classifiers that predict target genes with a recall of 58% in 2 Mb intervals that may contain dozens of genes, representing a better than two-fold improvement over the performance of prediction based on single features alone. Genome-wide ChIP data is still relatively poorly understood, and it remains difficult to assign biological significance to binding events. Our study represents a first step in integrating various genomic features in order to elucidate the genomic network of long-range regulatory interactions.
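
To make the recall figures above concrete, the following is a minimal sketch of how a proximity-only baseline could be scored within 2 Mb intervals centered at enhancers. It is not the authors' pipeline: the gene and enhancer coordinates, the known target pairs, and the names `Gene`, `candidates`, `nearest_gene` and `recall` are all hypothetical and chosen for illustration only. Assigning each enhancer to the single nearest gene is one assumed reading of "genomic proximity".

```python
from dataclasses import dataclass

# Half-width of the 2 Mb interval centered at each enhancer.
WINDOW_HALF_WIDTH = 1_000_000


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    tss: int  # transcription start site, in bp (toy values)


def candidates(chrom, pos, genes):
    """Genes whose TSS lies inside the 2 Mb interval centered at the enhancer."""
    return [g for g in genes
            if g.chrom == chrom and abs(g.tss - pos) <= WINDOW_HALF_WIDTH]


def nearest_gene(chrom, pos, genes):
    """Proximity baseline: name of the closest candidate gene, or None."""
    cands = candidates(chrom, pos, genes)
    if not cands:
        return None
    return min(cands, key=lambda g: abs(g.tss - pos)).name


def recall(predicted, true_targets):
    """Fraction of known enhancer-target pairs recovered by the predictions."""
    if not true_targets:
        return 0.0
    hits = sum(1 for enh, gene in true_targets.items()
               if predicted.get(enh) == gene)
    return hits / len(true_targets)


# Hypothetical toy data, not taken from the study.
genes = [
    Gene("geneA", "chr1", 100_000),
    Gene("geneB", "chr1", 900_000),
    Gene("geneC", "chr1", 2_500_000),
    Gene("geneD", "chr2", 500_000),
]
enhancers = {
    "enh1": ("chr1", 150_000),
    "enh2": ("chr1", 1_000_000),
    "enh3": ("chr2", 400_000),
    "enh4": ("chr1", 2_000_000),
}
true_targets = {"enh1": "geneA", "enh2": "geneA",
                "enh3": "geneD", "enh4": "geneB"}

predicted = {enh: nearest_gene(chrom, pos, genes)
             for enh, (chrom, pos) in enhancers.items()}
baseline_recall = recall(predicted, true_targets)
print(f"proximity-only recall: {baseline_recall:.2f}")
```

In this toy setup the nearest gene is the true target for only some enhancers, which illustrates why proximity alone can leave recall low when an interval contains several genes and the true target is not the closest one.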

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Chromatin Immunoprecipitation
  • DNA-Binding Proteins / metabolism
  • Enhancer Elements, Genetic*
  • Genomics / methods*
  • Kruppel-Like Transcription Factors / metabolism*
  • Mice
  • Nerve Tissue Proteins / metabolism*
  • Oligonucleotide Array Sequence Analysis
  • Protein Interaction Mapping / methods*
  • Synteny
  • Zinc Finger Protein Gli3

Substances

  • DNA-Binding Proteins
  • Gli3 protein, mouse
  • Kruppel-Like Transcription Factors
  • Nerve Tissue Proteins
  • Zinc Finger Protein Gli3