Cross-Species Genome-Wide Identification of Evolutionary Conserved MicroProteins

Genome Biol Evol. 2017 Mar 1;9(3):777-789. doi: 10.1093/gbe/evx041.

Abstract

MicroProteins are small single-domain proteins that act by engaging their targets in different, sometimes nonproductive protein complexes. To identify novel microProteins in any sequenced genome of interest, we have developed miPFinder, a program that identifies and classifies potential microProteins. In recent years, several microProteins have been discovered in plants, where they are mainly involved in the regulation of development by fine-tuning transcription factor activities. The miPFinder algorithm identifies all plant microProteins known to date and extends the microProtein concept beyond transcription factors to other protein families. Here, we reveal potential microProtein candidates in several plant and animal reference genomes. A large number of these microProteins are species-specific, while others evolved early and are evolutionarily highly conserved. Most known microProtein genes originated from large ancestral genes by gene duplication, mutation and subsequent degradation. Gene ontology analysis shows that putative microProtein ancestors are often located in the nucleus and are involved in DNA binding and formation of protein complexes. Additionally, microProtein candidates act in plant transcriptional regulation, signal transduction and anatomical structure development. MiPFinder is freely available to find microProteins in any genome and will aid in the identification of novel microProteins in plants and animals.
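
The idea described in the abstract, small single-domain proteins related to larger ancestral proteins, can be illustrated with a minimal sketch. This is not the miPFinder implementation: the `Protein` record, the `find_candidates` function, the length cutoffs, and the requirement that an ancestor carries additional domains are all assumptions made for illustration, and the toy records are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    name: str
    length: int          # length in amino acids
    domains: frozenset   # annotated domain identifiers


# Hypothetical cutoffs, chosen only for this illustration.
MAX_MICRO_LENGTH = 140
MIN_ANCESTOR_LENGTH = 250


def find_candidates(proteins, max_micro=MAX_MICRO_LENGTH,
                    min_ancestor=MIN_ANCESTOR_LENGTH):
    """Map each small single-domain protein to the larger proteins sharing its domain."""
    large = [p for p in proteins if p.length >= min_ancestor]
    candidates = {}
    for p in proteins:
        # Keep only small proteins annotated with exactly one domain.
        if p.length > max_micro or len(p.domains) != 1:
            continue
        (domain,) = p.domains
        # Assumption: a putative ancestor is large, shares the domain,
        # and has at least one further domain that the small protein lacks.
        ancestors = [q.name for q in large
                     if domain in q.domains and len(q.domains) > 1]
        if ancestors:
            candidates[p.name] = ancestors
    return candidates


# Invented toy records, not real annotations.
toy = [
    Protein("small_A", 90, frozenset({"HLH"})),
    Protein("large_B", 400, frozenset({"HLH", "DNA_binding"})),
    Protein("small_C", 100, frozenset({"kinase"})),
    Protein("mid_D", 200, frozenset({"HLH", "DNA_binding"})),
]
print(find_candidates(toy))  # {'small_A': ['large_B']}
```

In this sketch, `small_C` is dropped because no larger protein shares its domain, and `mid_D` is too long to be a candidate and too short to count as an ancestor under the assumed cutoffs.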

Keywords: metazoa; miPFinder; microProteins; plants; protein–protein interaction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Arabidopsis / genetics
  • DNA-Binding Proteins / genetics
  • Evolution, Molecular*
  • Gene Duplication
  • Gene Expression Regulation, Plant
  • Genome, Plant*
  • Mutation
  • Phylogeny
  • Proteins / genetics*
  • Proteins / isolation & purification
  • Software*

Substances

  • DNA-Binding Proteins
  • Proteins