Learning dynamical information from static protein and sequencing data

Nat Commun. 2019 Nov 26;10(1):5368. doi: 10.1038/s41467-019-13307-x.

Abstract

Many complex processes, from protein folding to neuronal network dynamics, can be described as stochastic exploration of a high-dimensional energy landscape. Although efficient algorithms for cluster detection in high-dimensional spaces have been developed over the last two decades, considerably less is known about the reliable inference of state transition dynamics in such settings. Here we introduce a flexible and robust numerical framework to infer Markovian transition networks directly from time-independent data sampled from stationary equilibrium distributions. We demonstrate the practical potential of the inference scheme by reconstructing the network dynamics for several protein-folding transitions, gene-regulatory network motifs, and HIV evolution pathways. The predicted network topologies and relative transition time scales agree well with direct estimates from time-dependent molecular dynamics data, stochastic simulations, and phylogenetic trees, respectively. Owing to its generic structure, the framework introduced here will be applicable to high-throughput RNA and protein-sequencing datasets, and future cryo-electron microscopy (cryo-EM) data.
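
The abstract does not spell out the algorithm, so the following is only a minimal illustrative sketch of the general idea and not the authors' method. It assumes a hypothetical one-dimensional tilted double-well energy (in units of kT), draws time-independent samples from the corresponding stationary distribution, estimates the landscape as E(x) = -log p(x) from a histogram, and converts the estimated barrier heights into relative Arrhenius-type rates with a unit prefactor. All function names, the energy function, and the binning parameters are assumptions made for illustration.

```python
import math
import random


def energy(x, tilt=0.5):
    """Toy tilted double-well energy in units of kT (hypothetical example)."""
    return 4.0 * (x * x - 1.0) ** 2 + tilt * x


def sample_stationary(n, lo=-2.5, hi=2.5, seed=0):
    """Draw n static samples from p(x) ~ exp(-E(x)) by rejection sampling."""
    rng = random.Random(seed)
    # Approximate the global energy minimum on a grid to scale acceptance.
    e_min = min(energy(lo + (hi - lo) * i / 1000) for i in range(1001))
    samples = []
    while len(samples) < n:
        x = rng.uniform(lo, hi)
        if rng.random() < math.exp(-(energy(x) - e_min)):
            samples.append(x)
    return samples


def estimate_energy(samples, lo=-2.5, hi=2.5, bins=50):
    """Estimate E(x) = -log p(x) per bin, up to an additive constant."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in samples:
        k = min(int((x - lo) / width), bins - 1)
        counts[k] += 1
    # A pseudocount of 0.5 keeps the energy of empty bins finite.
    return [-math.log((c + 0.5) / len(samples)) for c in counts]


def two_state_rates(e_hat):
    """Relative Arrhenius-type rates exp(-barrier) between the two wells."""
    mid = len(e_hat) // 2
    i_left = min(range(mid), key=lambda k: e_hat[k])
    i_right = min(range(mid, len(e_hat)), key=lambda k: e_hat[k])
    # Barrier: highest estimated energy on the path between the two minima.
    e_barrier = max(e_hat[i_left:i_right + 1])
    k_lr = math.exp(-(e_barrier - e_hat[i_left]))
    k_rl = math.exp(-(e_barrier - e_hat[i_right]))
    return k_lr, k_rl


samples = sample_stationary(20000)
k_lr, k_rl = two_state_rates(estimate_energy(samples))
print(f"k(left->right) = {k_lr:.4f}, k(right->left) = {k_rl:.4f}")
```

Because the left well is deeper in this toy landscape, the estimated escape rate from it comes out smaller than the rate from the right well. Only the relative rates are meaningful here, since static data carry no absolute time scale.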

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Evolution, Molecular
  • Gene Regulatory Networks*
  • HIV / genetics
  • Markov Chains
  • Molecular Dynamics Simulation
  • Protein Folding*
  • Proteins* / chemistry
  • Proteins* / genetics

Substances

  • Proteins