Unrolr: Structural analysis of protein conformations using stochastic proximity embedding

Jérôme Eberhardt; Roland H Stote; Annick Dejaegere

doi:10.1002/jcc.25599

J Comput Chem. 2018 Nov 15;39(30):2551-2557. doi: 10.1002/jcc.25599.

Authors

Jérôme Eberhardt¹, Roland H Stote¹, Annick Dejaegere¹

Affiliation

¹ Biologie structurale intégrative Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), Institut National de La Santé et de La Recherche Médicale (INSERM), U1258/Centre National de Recherche Scientifique (CNRS), UMR7104/Université de Strasbourg, Illkirch, France.

PMID: 30447084
DOI: 10.1002/jcc.25599

Abstract

Molecular dynamics (MD) simulations are widely used to explore the conformational space of biological macromolecules. Advances in hardware, as well as in methods, make the generation of large and complex MD datasets much more common. Although different clustering and dimensionality reduction methods have been applied to MD simulations, there remains a need for improved strategies that handle nonlinear data and/or can be applied to very large datasets. We present an original implementation of the pivot-based version of the stochastic proximity embedding method aimed at large MD datasets using the dihedral distance as a metric. The advantages of the algorithm in terms of data storage and computational efficiency are presented, along with the implementation itself. Application and testing through the analysis of a 200 ns accelerated MD simulation of the 35-residue villin headpiece are discussed. Analysis of the simulation shows the promise of this method to organize large conformational ensembles. © 2018 Wiley Periodicals, Inc.
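
The idea named in the abstract, pivot-based stochastic proximity embedding with a dihedral distance, can be illustrated with a minimal sketch. This is not the Unrolr implementation: the function names, the parameters, the linear learning-rate schedule, and the exact form of the dihedral distance (a common periodic form based on the cosine of the angle difference) are all assumptions made for illustration. In each cycle one conformation is picked as the pivot and every other point in the low-dimensional map is nudged so that its distance to the pivot moves toward the dihedral distance. Distances are computed on the fly, so no full pairwise distance matrix is stored.

```python
import math
import random


def dihedral_distance(a, b):
    """Periodic distance between two equal-length lists of dihedral angles
    (radians). Assumed form: sqrt(mean(0.5 * (1 - cos(a_k - b_k)))), in [0, 1]."""
    total = sum(0.5 * (1.0 - math.cos(p - q)) for p, q in zip(a, b))
    return math.sqrt(total / len(a))


def pivot_spe(conformations, n_components=2, n_cycles=200,
              lr_start=1.0, lr_end=0.01, eps=1e-8, seed=0):
    """Hypothetical pivot-based SPE sketch.

    conformations: list of dihedral-angle lists, one per MD frame.
    Returns a list of n_components-dimensional coordinates, one per frame.
    """
    rng = random.Random(seed)
    n = len(conformations)
    # Random starting coordinates in the low-dimensional space.
    emb = [[rng.random() for _ in range(n_components)] for _ in range(n)]

    for cycle in range(n_cycles):
        # Learning rate decreases linearly (an assumed schedule).
        lr = lr_start + (lr_end - lr_start) * cycle / max(n_cycles - 1, 1)
        p = rng.randrange(n)      # pivot index for this cycle
        pivot = list(emb[p])      # the pivot itself stays fixed

        for i in range(n):
            if i == p:
                continue
            # Target distance in dihedral space, computed only when needed.
            r = dihedral_distance(conformations[i], conformations[p])
            diff = [xi - xp for xi, xp in zip(emb[i], pivot)]
            d = math.sqrt(sum(c * c for c in diff))
            # Move point i toward or away from the pivot so that d approaches r.
            scale = lr * 0.5 * (r - d) / (d + eps)
            emb[i] = [xi + scale * c for xi, c in zip(emb[i], diff)]
    return emb
```

For example, calling pivot_spe on a list of frames, each given as a list of backbone dihedral angles in radians, returns one 2D point per frame. Each cycle touches every frame once, so the cost per cycle grows linearly with the number of frames, and memory use is limited to the input angles and the embedding.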

Keywords: dimensionality reduction; molecular dynamics; stochastic proximity embedding; villin headpiece.

Publication types

News
Research Support, Non-U.S. Gov't

MeSH terms

Databases, Protein
Molecular Dynamics Simulation*
Protein Conformation*
Proteins / chemistry*
Stochastic Processes*

Substances

Proteins