Functional annotation of intrinsically disordered domains by their amino acid content using IDD Navigator

Pac Symp Biocomput. 2012:164-75.

Abstract

Function prediction of intrinsically disordered domains (IDDs) using sequence similarity methods is limited by their high mutability and the prevalence of low-complexity regions. We describe a novel method for identifying similar IDDs using a similarity metric based on amino acid composition, and we identify significantly overrepresented Gene Ontology (GO) and Pfam domain annotations within highly similar IDDs. Applications and extensions of the proposed method are discussed, in particular with respect to protein functional annotation. We test the predicted annotations in a large-scale survey of IDDs in mouse and find that the proposed method provides significantly greater protein coverage in terms of function prediction than traditional sequence alignment methods such as BLAST. As a proof of concept, we examined several disorder-containing proteins: GRA15 and ROP16, both encoded by the parasitic protozoan Toxoplasma gondii; Cyclon, a mostly uncharacterized protein involved in the regulation of immune cell death; and STIM1, a protein essential for regulating calcium levels in the endoplasmic reticulum. We show that the overrepresented GO terms are consistent with recently reported biological functions. We implemented the method in the web server IDD Navigator, which is available at http://sysimm.ifrec.osaka-u.ac.jp/disorder/beta.php.
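
To make the idea of a composition-based similarity metric concrete, the following is a minimal sketch only. The abstract does not state which metric IDD Navigator uses, so the Euclidean distance between 20-residue frequency vectors is an assumption made for illustration, and the function names `composition` and `composition_distance` are hypothetical.

```python
from collections import Counter
from math import sqrt

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the amino acid frequency vector of a sequence.

    Non-standard characters (e.g. 'X' or gaps) are ignored.
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def composition_distance(seq_a, seq_b):
    """Euclidean distance between two composition vectors.

    ASSUMPTION: Euclidean distance is an illustrative choice, not
    necessarily the metric used by IDD Navigator.
    """
    va = composition(seq_a)
    vb = composition(seq_b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))
```

Because only residue frequencies enter the comparison, two domains with the same residues in a different order (for example `"SSPP"` and `"PSPS"`) have distance zero. This property is what would let a composition-based measure relate domains that alignment-based methods cannot match.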

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Amino Acids / analysis
  • Animals
  • Calcium Channels
  • Computational Biology
  • Databases, Protein
  • Membrane Glycoproteins / chemistry
  • Membrane Glycoproteins / genetics
  • Mice
  • Molecular Sequence Annotation
  • Nuclear Proteins / chemistry
  • Nuclear Proteins / genetics
  • Protein Structure, Tertiary
  • Protein-Tyrosine Kinases / chemistry
  • Protein-Tyrosine Kinases / genetics
  • Proteins / chemistry*
  • Proteins / genetics
  • Protozoan Proteins / chemistry
  • Protozoan Proteins / genetics
  • Sequence Alignment
  • Software*
  • Stromal Interaction Molecule 1
  • Toxoplasma / chemistry
  • Toxoplasma / genetics

Substances

  • Amino Acids
  • Calcium Channels
  • Membrane Glycoproteins
  • Nuclear Proteins
  • Proteins
  • Protozoan Proteins
  • Stim1 protein, mouse
  • Stromal Interaction Molecule 1
  • cyclon protein, mouse
  • Protein-Tyrosine Kinases
  • Rop16 protein, Toxoplasma gondii