Prediction of single-cell gene expression for transcription factor analysis

Gigascience. 2020 Oct 30;9(11):giaa113. doi: 10.1093/gigascience/giaa113.

Abstract

Background: Single-cell RNA sequencing is a powerful technology to discover new cell types and study biological processes in complex biological samples. A current challenge is to predict transcription factor (TF) regulation from single-cell RNA data.

Results: Here, we propose a novel approach for predicting gene expression at the single-cell level using cis-regulatory motifs and epigenetic features. We designed a tree-guided multi-task learning framework that treats each cell as a task. With this framework, we were able to explain single-cell gene expression values using either TF binding affinities or TF ChIP-seq data measured at specific genomic regions. TFs identified by these models could be validated against the literature.
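
To make the "each cell as a task" idea concrete, the following is a minimal sketch and not the Triangulate implementation. It assumes a feature matrix X (genes by TF features) and a response matrix Y (genes by cells), both synthetic here, and it swaps the tree-guided penalty for a simpler flat row-wise L2,1 penalty, which corresponds to a tree with a single group containing all cells. The function name fit_multitask_l21 and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def fit_multitask_l21(X, Y, lam=0.1, n_iter=500):
    """Multi-task linear regression with a row-wise L2,1 penalty.

    Minimizes (1/2n) * ||Y - X B||_F^2 + lam * sum_j ||B[j, :]||_2
    by proximal gradient descent.

    X: (n_genes, n_features)  e.g. TF binding affinities per gene (assumed layout)
    Y: (n_genes, n_cells)     expression of each gene in each cell; one task per cell
    Returns B: (n_features, n_cells), one coefficient column per cell.
    """
    n, p = X.shape
    B = np.zeros((p, Y.shape[1]))
    # Step size from the Lipschitz constant of the smooth least-squares term.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ B - Y) / n
        Z = B - step * grad
        # Proximal operator of the L2,1 norm: shrink each feature row jointly,
        # so a feature is kept or dropped for all cells at once.
        norms = np.linalg.norm(Z, axis=1, keepdims=True)
        scale = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        B = Z * scale
    return B


# Synthetic example (fabricated data for illustration only):
# 200 genes, 10 TF features, 5 cells; only features 0 and 3 are truly active.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_B = np.zeros((10, 5))
true_B[0] = [1.0, 1.2, 0.8, 1.1, 0.9]
true_B[3] = [-1.0, -0.5, -1.5, -0.8, -1.2]
Y = X @ true_B + 0.1 * rng.normal(size=(200, 5))

B_hat = fit_multitask_l21(X, Y, lam=0.1)
row_norms = np.linalg.norm(B_hat, axis=1)
print("features ranked by joint coefficient norm:", np.argsort(-row_norms))
```

In this simplified setting, features whose rows of B_hat keep a large norm are the ones selected jointly across cells. A tree-guided penalty would instead apply such group shrinkage at every node of a hierarchy over cells, which lets a feature be selected for one subgroup of cells but not for others.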

Conclusion: Our proposed method allows us to identify distinct TFs that show cell type-specific regulation. The approach is not limited to TFs: it can use any type of data that may help explain gene expression at the single-cell level, in order to study factors that drive differentiation or show abnormal regulation in disease. The implementation of our workflow is available under an MIT license at https://github.com/SchulzLab/Triangulate.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites
  • Gene Expression
  • Gene Expression Regulation*
  • Protein Binding
  • Transcription Factors* / genetics
  • Transcription Factors* / metabolism

Substances

  • Transcription Factors