An Overview on Predicting Protein Subchloroplast Localization by using Machine Learning Methods

Curr Protein Pept Sci. 2020;21(12):1229-1241. doi: 10.2174/1389203721666200117153412.

Abstract

The chloroplast is a subcellular organelle of green plants and eukaryotic algae that plays an important role in photosynthesis. Since the function of a protein correlates with its location, knowing a protein's subchloroplast localization helps elucidate its function. However, because of the large number of chloroplast proteins, determining their subchloroplast localizations through biological experiments is costly and time-consuming. To address this problem, twelve computational methods for predicting protein subchloroplast localization have been developed during the past ten years. This review summarizes the research progress in this area. We hope it will provide useful guidance for further computational studies of protein subchloroplast localization.

Keywords: Protein; dataset; feature selection; machine learning method; protein sequence properties; subchloroplast localization.

Publication types

  • Review

MeSH terms

  • Amino Acid Sequence
  • Chloroplast Proteins / classification
  • Chloroplast Proteins / genetics*
  • Chloroplast Proteins / metabolism
  • Chloroplasts / genetics*
  • Chloroplasts / metabolism
  • Computational Biology / methods
  • Computational Biology / statistics & numerical data
  • Datasets as Topic
  • Gene Expression Regulation, Plant*
  • Machine Learning*
  • Models, Statistical*
  • Plants / genetics
  • Plants / metabolism
  • Protein Transport
  • Proteome / classification
  • Proteome / genetics*
  • Proteome / metabolism

Substances

  • Chloroplast Proteins
  • Proteome