AutoClassWeb: a simple web interface for Bayesian clustering of omics data

BMC Res Notes. 2022 Jul 7;15(1):241. doi: 10.1186/s13104-022-06129-6.

Abstract

Objective: Data clustering is a common exploration step in the omics era, notably in genomics and proteomics where many genes or proteins can be quantified from one or more experiments. Bayesian clustering is a powerful unsupervised algorithm that can classify several thousand genes or proteins. AutoClass C, its original implementation, handles missing data and automatically determines the best number of clusters, but it is not user-friendly.

Results: We developed an online tool called AutoClassWeb, which provides a simple, easy-to-use web interface for Bayesian clustering with AutoClass. Input data are supplied as TSV files and undergo quality control. Results are provided in formats that ease further analysis with spreadsheet programs or with programming languages such as Python or R. AutoClassWeb is implemented in Python and is published under the 3-Clause BSD license. The source code is available at https://github.com/pierrepo/autoclassweb along with detailed documentation.
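
To illustrate what quality control of a TSV input can involve, here is a minimal Python sketch. It is not AutoClassWeb's actual implementation: the function name check_tsv, the assumed table layout (a header row, an identifier in the first column, numeric values afterwards, empty cells for missing data), and the specific checks are all assumptions made for this example.

```python
import csv
import io


def check_tsv(text):
    """Return a list of problems found in a tab-separated table.

    Assumed layout (hypothetical, not taken from AutoClassWeb): a header
    row, then one row per gene or protein with an identifier in the first
    column and numeric values (or empty cells for missing data) after it.
    """
    problems = []
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if len(rows) < 2:
        return ["table needs a header row and at least one data row"]

    header = rows[0]
    if len(set(header)) != len(header):
        problems.append("duplicate column names in header")

    for line_no, row in enumerate(rows[1:], start=2):
        # Every data row must have as many fields as the header.
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
            continue
        # Skip the identifier column and check the remaining cells.
        for name, cell in zip(header[1:], row[1:]):
            if cell.strip() == "":
                continue  # empty cell is treated as a missing value
            try:
                float(cell)
            except ValueError:
                problems.append(
                    f"line {line_no}: non-numeric value {cell!r} in column {name!r}"
                )
    return problems
```

Calling check_tsv on the contents of a file returns an empty list when no problem is detected under these assumptions; otherwise each entry describes one issue and the line on which it occurs.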

Keywords: Autoclass; Bayesian; Clustering; Genomics; Machine learning; Proteomics.

MeSH terms

  • Bayes Theorem
  • Cluster Analysis
  • Genomics
  • Programming Languages*
  • Software*