Predicting Essential Genes and Proteins Based on Machine Learning and Network Topological Features: A Comprehensive Review

Front Physiol. 2016 Mar 8;7:75. doi: 10.3389/fphys.2016.00075. eCollection 2016.

Abstract

Essential genes/proteins are indispensable to the survival or reproduction of an organism; deleting them results in lethality or infertility. Identifying essential genes is important not only for understanding the minimal requirements for an organism's survival, but also for finding human disease genes and new drug targets. Experimental methods for identifying essential genes are costly, time-consuming, and laborious. With the accumulation of sequenced genome data and high-throughput experimental data, many computational methods for identifying essential proteins have been proposed as useful complements to experimental methods. In this review, we survey state-of-the-art methods for identifying essential genes and proteins based on machine learning and network topological features, point out the progress and limitations of current methods, and discuss the challenges and directions for further research.

Keywords: essential genes/proteins; machine learning; network topological features; prediction models; systems biology.

Publication types

  • Review