Integrating single-cell datasets with ambiguous batch information by incorporating molecular network features

Brief Bioinform. 2022 Jan 17;23(1):bbab366. doi: 10.1093/bib/bbab366.

Abstract

With the rapid development of single-cell sequencing techniques, several large-scale cell atlas projects have been launched across the world. However, it remains challenging to integrate single-cell RNA-seq (scRNA-seq) datasets that differ in tissue source or developmental stage and/or have few overlaps, because the batch information, on which current batch-effect correction methods depend, is ambiguous to determine. Here, we present SCORE, a simple network-based integration methodology, which incorporates curated molecular network features to infer cellular states and provides a unified workflow for integrating scRNA-seq datasets. Validating on real single-cell datasets, we show that, regardless of batch information, SCORE outperforms existing methods in accuracy, robustness, scalability and data integration.
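
The general idea of describing cells by network-level rather than gene-level features can be illustrated with a minimal sketch. This is not SCORE's algorithm, which the abstract does not detail; it is a toy example under stated assumptions: the function name `network_feature_scores`, the use of each gene's direct network neighbourhood as a feature, and the averaging of z-scored expression are all hypothetical choices made for illustration only.

```python
import numpy as np

def network_feature_scores(expr, genes, edges):
    """Toy network-feature scoring (illustrative assumption, not SCORE itself).

    expr  : array of shape (n_cells, n_genes), expression values
    genes : list of gene names, one per column of expr
    edges : list of (gene_a, gene_b) pairs from a molecular network
    Returns an (n_cells, n_genes) array in which column j is the mean
    z-scored expression over gene j and its network neighbours.
    """
    expr = np.asarray(expr, dtype=float)
    index = {g: i for i, g in enumerate(genes)}
    n = len(genes)

    # Symmetric adjacency with self-loops, so each gene is part of
    # its own neighbourhood.
    adj = np.eye(n)
    for a, b in edges:
        if a in index and b in index:
            adj[index[a], index[b]] = 1.0
            adj[index[b], index[a]] = 1.0

    # Z-score each gene across cells; constant genes get a z-score of 0.
    mu = expr.mean(axis=0)
    sd = expr.std(axis=0)
    sd[sd == 0] = 1.0
    z = (expr - mu) / sd

    # Column-normalise so each feature is an average over its neighbourhood.
    weights = adj / adj.sum(axis=0, keepdims=True)
    return z @ weights

# Two cells, three genes, one network edge between A and B.
expr = np.array([[1.0, 2.0, 3.0],
                 [3.0, 2.0, 1.0]])
scores = network_feature_scores(expr, ["A", "B", "C"], [("A", "B")])
print(scores)
```

In this sketch, genes A and B share a neighbourhood and therefore receive the same feature value per cell, while the unconnected gene C keeps its own z-score. Whether such neighbourhood averaging resembles the curated network features used by SCORE is an assumption; the published method should be consulted for the actual procedure.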

Keywords: data integration; molecular network; protein–protein interaction; single-cell RNA-seq.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Exome Sequencing
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis* / methods