GECluster: a novel protein complex prediction method

Biotechnol Biotechnol Equip. 2014 Jul 4;28(4):753-761. doi: 10.1080/13102818.2014.946700. Epub 2014 Oct 17.

Abstract

Identifying protein complexes is of great importance for understanding cellular organization and function. Traditional computational methods for protein complex prediction rely mainly on the topology of protein-protein interaction (PPI) networks and seldom take biological information about proteins, such as Gene Ontology (GO) annotations, into consideration. Meanwhile, environment-dependent analysis of protein complex evolution has been poorly studied, partly because of the lack of high-precision protein complex datasets. In this paper, a combined PPI network that integrates both GO and the expression values of the relevant protein-coding genes is introduced for predicting protein complexes. A novel protein complex prediction method, GECluster (Gene Expression Cluster), is proposed; it is based on a seed node expansion strategy and operates on the combined PPI network. Applied to a training combined PPI network, GECluster predicted more credible complexes than peer methods. The results indicate that using a combined PPI network can effectively improve protein complex prediction accuracy. To study how protein complexes evolve within cells in response to changes in the surrounding environment, GECluster was applied to seven combined PPI networks constructed from a test set describing the yeast response to stress throughout a wine fermentation process. Our results showed that, as the alcohol concentration rises, protein complexes within yeast cells gradually evolve from one state to another. In addition, the numbers of core and attachment proteins within a protein complex both changed significantly.
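The abstract does not give the weighting formula or the expansion rule, so the following is only a minimal sketch of the general idea, not the published GECluster procedure. It assumes, purely for illustration, that each PPI edge receives a weight equal to a linear blend of a GO similarity score and an expression similarity score, and that a cluster grows from a seed by repeatedly adding the neighbouring protein with the highest mean weight to the current members. The names `combine_weights`, `expand_from_seed`, `alpha` and `threshold`, and all numeric values, are hypothetical.

```python
def combine_weights(edges, go_sim, expr_sim, alpha=0.5):
    """Blend GO and expression similarity into one weight per PPI edge.

    The linear blend and `alpha` are assumptions made for this sketch.
    """
    weights = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        key = frozenset((u, v))
        weights[key] = (alpha * go_sim.get(key, 0.0)
                        + (1 - alpha) * expr_sim.get(key, 0.0))
    return weights


def build_adjacency(weights):
    """Map each protein to the set of its interaction partners."""
    adj = {}
    for key in weights:
        u, v = tuple(key)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def expand_from_seed(seed, weights, adj, threshold=0.5):
    """Greedily grow a cluster around `seed` (hypothetical expansion rule).

    A candidate's score is its mean combined weight to all current
    members; pairs with no edge contribute zero.
    """
    cluster = {seed}
    while True:
        frontier = set().union(*(adj.get(n, set()) for n in cluster)) - cluster
        best, best_score = None, -1.0
        for cand in sorted(frontier):  # sorted for deterministic ties
            total = sum(weights.get(frozenset((cand, m)), 0.0) for m in cluster)
            score = total / len(cluster)
            if score >= threshold and score > best_score:
                best, best_score = cand, score
        if best is None:
            return cluster
        cluster.add(best)


# Toy network with made-up similarity values.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
go_sim = {frozenset(("A", "B")): 0.9, frozenset(("B", "C")): 0.8,
          frozenset(("A", "C")): 0.9, frozenset(("C", "D")): 0.2}
expr_sim = {frozenset(("A", "B")): 0.8, frozenset(("B", "C")): 0.9,
            frozenset(("A", "C")): 0.7, frozenset(("C", "D")): 0.1}

weights = combine_weights(edges, go_sim, expr_sim)
adj = build_adjacency(weights)
complex_a = expand_from_seed("A", weights, adj)
print(sorted(complex_a))
```

In this toy network the weakly supported protein D stays outside the cluster grown from A, which illustrates how weights that carry biological information, rather than topology alone, could shape the predicted complex.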

Keywords: GO; PPI; core and attachment protein; evolution; gene expression value; protein complex.

Grants and funding

This work was supported by the National Natural Science Foundation of China [grant numbers 61373051 and 61175023]; the Science and Technology Development Program of Jilin Province [grant numbers 20140204004GX and 20140520072JH]; the Project of Science and Technology Innovation Platform of Computing and Software Science (985 Engineering); the Key Laboratory for Symbol Computation and Knowledge Engineering of the National Education Ministry of China; and the Fundamental Research Funds for the Central Universities, China [grant number 14QNJJ030].