SOMRuler: a novel interpretable transmembrane helices predictor

IEEE Trans Nanobioscience. 2011 Jun;10(2):121-9. doi: 10.1109/TNB.2011.2160730. Epub 2011 Jul 7.

Abstract

Transmembrane helix (TMH) identification is one of the most important steps in membrane protein structure prediction. Existing TMH predictors tend to pursue accurate computational models without carefully considering the interpretability of those models, and thus act as black boxes. This paper presents SOMRuler, a novel TMH predictor that offers excellent interpretability together with high prediction accuracy. SOMRuler uses a self-organizing map (SOM) to learn knowledge of the helix distribution from the training samples; this knowledge is encoded in the codebook vectors of the trained SOM. Human-interpretable fuzzy rules are then extracted from these codebook vectors. Extracting fuzzy rules from the learned knowledge rather than from the original training samples has two benefits: the computational burden of rule extraction is greatly reduced, and the reliability of the extracted rules is enhanced, since noise in the original samples is smoothed out by the SOM learning procedure. The validity of the fuzzy rules extracted by SOMRuler is analyzed both qualitatively and quantitatively. Experimental results on the benchmark dataset show that SOMRuler outperforms most existing popular TMH predictors and is flexible enough to suit a wide variety of problems in bioinformatics. The SOMRuler software is implemented in Java and Matlab and is available for academic use at: http://www.csbio.sjtu.edu.cn/bioinf/SOMRuler/.
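The two-stage idea described in the abstract (train a SOM, then derive fuzzy rules from its codebook vectors instead of from the raw samples) can be illustrated with a minimal Python sketch. This is not the SOMRuler implementation, which the abstract says is written in Java and Matlab. Everything concrete here is an assumption made for illustration: the 1-D map, the Gaussian membership functions, the majority-vote labelling, the two-dimensional toy features, and all function and parameter names (`train_som`, `extract_rules`, `firing_strength`, `classify`, `sigma`).

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def train_som(samples, n_units=4, epochs=50, lr0=0.5, seed=0):
    """Train a tiny 1-D self-organizing map and return its codebook vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Initialise each codebook vector from a randomly chosen training sample.
    codebook = [list(rng.choice(samples)) for _ in range(n_units)]
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(samples, len(samples)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)  # linearly decaying learning rate
            radius = max(n_units / 2.0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            bmu = min(range(n_units), key=lambda i: sq_dist(codebook[i], x))
            for i in range(n_units):
                # Units close to the best-matching unit on the map move more.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                for d in range(dim):
                    codebook[i][d] += lr * h * (x[d] - codebook[i][d])
            step += 1
    return codebook


def extract_rules(codebook, samples, labels, sigma=0.2):
    """Turn each codebook vector that wins at least one sample into one rule.

    A rule reads: IF every feature is 'close to' its centre THEN label,
    where 'close to' is a Gaussian membership of width sigma (an assumption).
    """
    hits = [[] for _ in codebook]
    for x, y in zip(samples, labels):
        bmu = min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], x))
        hits[bmu].append(y)
    rules = []
    for w, ys in zip(codebook, hits):
        if ys:  # skip units that no training sample maps to
            label = max(sorted(set(ys)), key=ys.count)  # majority vote
            rules.append({"centers": list(w), "sigma": sigma, "label": label})
    return rules


def firing_strength(rule, x):
    """Product of the Gaussian memberships of x in the rule's antecedents."""
    s = 1.0
    for c, v in zip(rule["centers"], x):
        s *= math.exp(-((v - c) ** 2) / (2.0 * rule["sigma"] ** 2))
    return s


def classify(rules, x):
    """Return the label of the rule that fires most strongly for x."""
    return max(rules, key=lambda r: firing_strength(r, x))["label"]


# Hypothetical toy data: two made-up features per sequence window.
samples = [(0.80 + 0.01 * i, 0.10 + 0.01 * i) for i in range(5)] + \
          [(0.20 + 0.01 * i, 0.50 + 0.01 * i) for i in range(5)]
labels = ["helix"] * 5 + ["non-helix"] * 5

codebook = train_som(samples)
rules = extract_rules(codebook, samples, labels)
for r in rules:
    centers = ", ".join(f"{c:.2f}" for c in r["centers"])
    print(f"IF features are near ({centers}) THEN {r['label']}")
```

In this sketch the number of rules is bounded by the number of map units rather than by the number of training samples, which mirrors the reduced extraction cost the abstract mentions, and each rule centre is an average over many samples, which mirrors the noise-smoothing argument.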

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Data Mining
  • Fuzzy Logic
  • Membrane Proteins / chemistry*
  • Protein Structure, Secondary
  • Proteomics / methods*
  • Software*

Substances

  • Membrane Proteins