Automatic Cell Type Annotation Using Marker Genes for Single-Cell RNA Sequencing Data

Biomolecules. 2022 Oct 21;12(10):1539. doi: 10.3390/biom12101539.

Abstract

Advances in single-cell RNA sequencing (scRNA-seq) technology have attracted growing attention, and cell type annotation plays an essential role in scRNA-seq data analysis. Several computational methods have been proposed for automatic annotation. Traditional cell type annotation first clusters the cells with unsupervised learning methods based on their gene expression profiles, then labels the clusters using aggregated cluster-level expression profiles and marker gene information. Such a procedure relies heavily on the clustering results: because the purity of clusters cannot be guaranteed, falsely detected cluster features may lead to wrong annotations. In this paper, we improve this procedure and propose an Automatic Cell type Annotation Method (ACAM). ACAM delineates a clear framework for automatic cell annotation in three steps: representative cluster identification, representative cluster annotation using marker genes, and classification of the remaining cells. Experiments on seven real datasets show that ACAM performs better than six well-known cell type annotation methods.
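
The three-step framework described above can be illustrated with a minimal Python sketch. It is not the ACAM implementation: the abstract does not specify the algorithms used at each step, so every rule here is an assumption made for illustration. Representative cells are taken to be those closest to their cluster centroid, clusters are scored by the mean expression of each cell type's marker genes, and the remaining cells are assigned to the nearest annotated centroid. The function names, the `keep_fraction` parameter, and the toy expression values are all hypothetical.

```python
from math import ceil, dist
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of equal-length expression vectors."""
    return [mean(col) for col in zip(*rows)]


def representative_cells(expr, clusters, keep_fraction=0.5):
    """Step 1 (assumed rule): per cluster, keep the cells closest to the centroid."""
    reps = {}
    for label in sorted(set(clusters)):
        members = [i for i, c in enumerate(clusters) if c == label]
        center = centroid([expr[i] for i in members])
        members.sort(key=lambda i: dist(expr[i], center))
        reps[label] = members[:max(1, ceil(len(members) * keep_fraction))]
    return reps


def annotate_clusters(expr, reps, genes, markers):
    """Step 2 (assumed rule): label each representative cluster with the cell
    type whose marker genes have the highest mean expression."""
    annotation = {}
    for label, members in reps.items():
        profile = dict(zip(genes, centroid([expr[i] for i in members])))
        scores = {}
        for cell_type, marker_genes in markers.items():
            present = [profile[g] for g in marker_genes if g in profile]
            if present:
                scores[cell_type] = mean(present)
        annotation[label] = max(scores, key=scores.get) if scores else "unknown"
    return annotation


def assign_all(expr, reps, annotation):
    """Step 3 (assumed rule): representative cells inherit their cluster's
    label; every other cell takes the label of the nearest centroid."""
    centers = {lab: centroid([expr[i] for i in m]) for lab, m in reps.items()}
    labels = [None] * len(expr)
    for lab, members in reps.items():
        for i in members:
            labels[i] = annotation[lab]
    for i, row in enumerate(expr):
        if labels[i] is None:
            nearest = min(centers, key=lambda lab: dist(row, centers[lab]))
            labels[i] = annotation[nearest]
    return labels


# Toy data (hypothetical values): rows are cells, columns follow `genes`.
genes = ["CD3E", "CD19", "LYZ"]
markers = {"T cell": ["CD3E"], "B cell": ["CD19"]}
expr = [
    [5.0, 0.0, 0.1], [4.5, 0.2, 0.0], [3.0, 2.5, 0.0],
    [0.1, 6.0, 0.0], [0.0, 5.5, 0.2], [1.0, 4.0, 0.0],
]
clusters = [0, 0, 0, 1, 1, 1]  # assumed output of any unsupervised clustering

reps = representative_cells(expr, clusters)
annotation = annotate_clusters(expr, reps, genes, markers)
labels = assign_all(expr, reps, annotation)
print(labels)
```

In this sketch only the representative cells contribute to the cluster-level profile used for marker scoring, which reflects the abstract's concern that impure clusters can produce misleading cluster features. A real analysis would replace each assumed rule with the method's actual procedure.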

Keywords: cell type annotation; marker genes; scRNA-seq.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biomarkers
  • Cluster Analysis
  • Data Analysis
  • Gene Expression Profiling / methods
  • RNA
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis* / methods
  • Transcriptome*

Substances

  • Biomarkers
  • RNA

Grants and funding

S. Zhang’s research is supported in part by the Science and Technology Commission of Shanghai Municipality (No. 20ZR1407700).