Protein aggregation, structural disorder and RNA-binding ability: a new approach for physico-chemical and gene ontology classification of multiple datasets

BMC Genomics. 2015 Dec 16:16:1071. doi: 10.1186/s12864-015-2280-z.

Abstract

Background: Comparison between multiple protein datasets requires the choice of an appropriate reference system and a number of variables to describe their differences. Here we introduce an innovative approach to discriminate multiple protein datasets (multiCM) and to measure enrichments in gene ontology terms (cleverGO) using semantic similarities.

Results: We illustrate the powerfulness of our approach by investigating the links between RNA-binding ability and other protein features, such as structural disorder and aggregation, in S. cerevisiae, C. elegans, M. musculus and H. sapiens. Our results are in striking agreement with available experimental evidence and unravel features that are key to understand the mechanisms regulating cellular homeostasis.

Conclusions: In an intuitive way, multiCM and cleverGO provide accurate classifications of physico-chemical features and annotations of biological processes, molecular functions and cellular components, which is extremely useful for the discovery and characterization of new trends in protein datasets. The multiCM and cleverGO can be freely accessed on the Web at http://www.tartaglialab.com/cs_multi/submission and http://www.tartaglialab.com/GO_analyser/universal . Each of the pages contains links to the corresponding documentation and tutorial.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Molecular Structure*
  • Protein Aggregation, Pathological*
  • RNA-Binding Proteins / chemistry*
  • RNA-Binding Proteins / metabolism*
  • Software
  • Solubility
  • Structure-Activity Relationship

Substances

  • RNA-Binding Proteins