Using "Galaxy-rCASC": A Public Galaxy Instance for Single-Cell RNA-Seq Data Analysis

Methods Mol Biol. 2023:2584:311-335. doi: 10.1007/978-1-0716-2756-3_16.

Abstract

rCASC is a modular workflow that provides an integrated environment for single-cell RNA-seq (scRNA-Seq) data analysis and uses Docker containers to achieve functional and computational reproducibility. It was initially developed as an R package that can also be used through a Java GUI. However, the Java frontend cannot be used when rCASC runs on a remote server, a typical setup given the significant computational resources commonly needed to analyze scRNA-Seq data. To allow the use of rCASC through a graphical user interface on the client side and to harness the many advantages provided by the Galaxy platform, we have made rCASC available as a set of Galaxy tools and also provide a dedicated public Galaxy instance named "Galaxy-rCASC." To integrate rCASC into Galaxy, all its functions, originally implemented as a set of Docker containers to maximize reproducibility, have been extensively reworked to become independent of the R package functions that launch them in the original implementation. Furthermore, suitable Galaxy wrappers have been developed for most functions of rCASC. We provide a detailed reference guide to using Galaxy-rCASC, with insights and explanations on the platform's functionalities, parameters, and outputs, while guiding the reader through the typical rCASC analysis workflow for an scRNA-Seq dataset.

Keywords: Docker; Galaxy; Reproducibility; scRNA-Seq.

MeSH terms

  • Computational Biology
  • Data Analysis
  • Humans
  • Reproducibility of Results
  • Single-Cell Analysis
  • Single-Cell Gene Expression Analysis*
  • Software*
  • Workflow