Detection of RNA-DNA binding sites in long noncoding RNAs

Nucleic Acids Res. 2019 Apr 8;47(6):e32. doi: 10.1093/nar/gkz037.

Abstract

Long non-coding RNAs (lncRNAs) can act as scaffolds that promote the interaction of proteins, RNA, and DNA. There is increasing evidence of sequence-specific interactions of lncRNAs with DNA via triple-helix (triplex) formation. This process allows lncRNAs to recruit protein complexes to specific genomic regions and regulate gene expression. Here we propose a computational method called Triplex Domain Finder (TDF) to detect triplexes and characterize DNA-binding domains and DNA targets statistically. Case studies showed that this approach can detect the known domains of lncRNAs Fendrr, HOTAIR and MEG3. Moreover, we validated a novel DNA-binding domain in MEG3 by a genome-wide sequencing method. We used TDF to perform a systematic analysis of the triplex-forming potential of lncRNAs relevant to human cardiac differentiation. We demonstrated that the lncRNA with the highest triplex-forming potential, GATA6-AS, forms triple helices in the promoters of genes relevant to cardiac development. Moreover, down-regulation of GATA6-AS impairs GATA6 expression and cardiac development. These data indicate the unique ability of our computational tool to identify novel triplex-forming lncRNAs and their target genes.
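
The abstract states that DNA-binding domains and DNA targets are characterized statistically but does not give the test. As a purely illustrative sketch, one common way to ask whether a candidate domain binds target promoters more often than background promoters is a one-sided hypergeometric (Fisher-type) enrichment test. The function name, its parameters, and the counts below are hypothetical and are not taken from TDF or from the paper.

```python
from math import comb


def enrichment_p_value(target_hits, target_total, background_hits, background_total):
    """One-sided hypergeometric tail probability P(X >= target_hits).

    Illustrative assumption, not TDF's documented procedure:
    target_hits / target_total        -> target promoters with a predicted triplex
    background_hits / background_total -> the same counts for background promoters
    """
    population = target_total + background_total   # all promoters considered
    successes = target_hits + background_hits      # all promoters with a triplex
    denominator = comb(population, target_total)   # ways to choose the target set
    upper = min(successes, target_total)
    # Sum the probability of every outcome at least as extreme as the observed one.
    tail = sum(
        comb(successes, k) * comb(population - successes, target_total - k)
        for k in range(target_hits, upper + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Made-up counts for illustration only.
    p = enrichment_p_value(target_hits=40, target_total=100,
                           background_hits=60, background_total=400)
    print(f"enrichment p-value: {p:.3g}")
```

A small p-value here would suggest that the candidate domain's predicted binding sites are over-represented in the target set relative to the background; with many candidate domains, a multiple-testing correction would also be needed.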

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Base Sequence
  • Binding Sites / genetics
  • Computational Biology / methods*
  • DNA / chemistry
  • DNA / metabolism*
  • Gene Expression
  • Humans
  • Nucleic Acid Conformation
  • Protein Binding
  • RNA, Long Noncoding / chemistry*
  • RNA, Long Noncoding / metabolism*
  • Transcription Factors / metabolism

Substances

  • HOTAIR long untranslated RNA, human
  • RNA, Long Noncoding
  • Transcription Factors
  • DNA