Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs

Bioinformatics. 2004 Jul 22;20(11):1746-58. doi: 10.1093/bioinformatics/bth163. Epub 2004 Mar 4.

Abstract

Summary: Biological and engineered networks have recently been shown to display network motifs: a small set of characteristic patterns that occur much more frequently than in randomized networks with the same degree sequence. Network motifs were demonstrated to play key information processing roles in biological regulation networks. Existing algorithms for detecting network motifs act by exhaustively enumerating all subgraphs with a given number of nodes in the network. The runtime of such algorithms increases strongly with network size. Here, we present a novel algorithm that allows estimation of subgraph concentrations and detection of network motifs at a runtime that is asymptotically independent of the network size. This algorithm is based on random sampling of subgraphs. Network motifs are detected with a surprisingly small number of samples in a wide variety of networks. Our method can be applied to estimate the concentrations of larger subgraphs in larger networks than was previously possible with exhaustive enumeration algorithms. We present results for high-order motifs in several biological networks and discuss their possible functions.
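
To illustrate the sampling idea in the Summary, the following Python sketch estimates the concentration of triangles among connected 3-node subgraphs of a small undirected graph. It is a simplified sketch, not the mfinder implementation. The example graph, the function names, the restriction to undirected 3-node subgraphs, and the inverse-probability weighting used to correct for non-uniform sampling are all assumptions made for this example.

```python
import random
from itertools import combinations

# Hypothetical toy graph: unique undirected edges, no self-loops.
EDGES = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5), (5, 6), (6, 0)]


def build_adjacency(edges):
    """Map each node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def sampling_probability(nodes, adj, m):
    """Probability that edge expansion yields exactly this 3-node set.

    Sums over every edge of the subgraph that could have been drawn first.
    Rejected draws only rescale this by a constant, which cancels in the ratio.
    """
    p = 0.0
    for x, y in combinations(nodes, 2):
        if y not in adj[x]:
            continue
        (z,) = set(nodes) - {x, y}
        hits = (z in adj[x]) + (z in adj[y])  # candidate edges that reach z
        p += hits / (m * (len(adj[x]) + len(adj[y]) - 2))
    return p


def sample_triple(edges, adj, rng):
    """Pick a random edge, then extend it along a random neighbouring edge."""
    while True:
        u, v = rng.choice(edges)
        # One entry per candidate edge leaving {u, v}; a shared neighbour appears twice.
        candidates = [w for w in adj[u] if w != v] + [w for w in adj[v] if w != u]
        if candidates:  # retry if the edge is isolated
            break
    return (u, v, rng.choice(candidates))


def estimate_triangle_concentration(edges, n_samples, seed=0):
    """Weighted estimate of triangles / all connected 3-node subgraphs."""
    rng = random.Random(seed)
    adj = build_adjacency(edges)
    weight = {"triangle": 0.0, "path": 0.0}
    for _ in range(n_samples):
        nodes = sample_triple(edges, adj, rng)
        is_triangle = all(b in adj[a] for a, b in combinations(nodes, 2))
        p = sampling_probability(nodes, adj, len(edges))
        weight["triangle" if is_triangle else "path"] += 1.0 / p
    return weight["triangle"] / (weight["triangle"] + weight["path"])


def exact_triangle_concentration(edges):
    """Exhaustive enumeration, for comparison on small graphs only."""
    adj = build_adjacency(edges)
    triangles = connected = 0
    for a, b, c in combinations(sorted(adj), 3):
        links = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        connected += links >= 2
        triangles += links == 3
    return triangles / connected


if __name__ == "__main__":
    print("exact:    ", exact_triangle_concentration(EDGES))
    print("estimated:", estimate_triangle_concentration(EDGES, 20000, seed=1))
```

In this sketch the work per sample depends only on the degrees of the nodes that are visited, not on the total number of nodes, whereas the exhaustive comparison loops over every node triple.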

Availability: A software tool for estimating subgraph concentrations and detecting network motifs (mfinder 1.1) and further information are available at http://www.weizmann.ac.il/mcb/UriAlon/

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms*
  • Animals
  • Artificial Intelligence*
  • Caenorhabditis elegans / physiology*
  • Computer Simulation
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation / physiology*
  • Models, Neurological*
  • Nerve Net / physiology*
  • Sample Size
  • Signal Transduction / physiology*