Molecular dynamics (MD) simulations are widely used to explore the conformational space of biological macromolecules. Advances in hardware and in methods have made the generation of large, complex MD datasets much more common. Although various clustering and dimensionality reduction methods have been applied to MD simulations, there remains a need for improved strategies that handle nonlinear data and/or scale to very large datasets. We present an original implementation of the pivot-based version of the stochastic proximity embedding method, aimed at large MD datasets and using the dihedral distance as a metric. We describe the implementation and the advantages of the algorithm in terms of data storage and computational efficiency. The method is applied to and tested on a 200 ns accelerated MD simulation of the 35-residue villin headpiece. Analysis of this simulation shows the promise of the method for organizing large conformational ensembles. © 2018 Wiley Periodicals, Inc.
Keywords: dimensionality reduction; molecular dynamics; stochastic proximity embedding; villin headpiece.
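For orientation, the idea of a pivot-based stochastic proximity embedding driven by a dihedral distance can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in this work: the function names `dihedral_distance` and `pivot_spe`, the particular form of the dihedral metric (root of the mean of 0.5(1 - cos Δ) over all angles), the neighborhood cutoff, and the linear learning-rate schedule are all assumptions chosen for clarity. Distances to the pivot are computed on the fly, so no N × N distance matrix is stored.

```python
import math
import random


def dihedral_distance(a, b):
    """Assumed dihedral metric between two frames given as lists of
    angles in radians: sqrt(mean(0.5 * (1 - cos(delta)))).
    It is periodic in each angle and ranges from 0 to 1."""
    total = sum(0.5 * (1.0 - math.cos(x - y)) for x, y in zip(a, b))
    return math.sqrt(total / len(a))


def pivot_spe(frames, dim=2, cutoff=0.5, cycles=50, steps=200,
              lam_start=1.0, lam_end=0.01, eps=1e-9, seed=0):
    """Sketch of a pivot-based SPE: at each step one frame is picked as
    the pivot and every other embedded point is moved toward or away
    from it. All parameter defaults are illustrative assumptions."""
    rng = random.Random(seed)
    n = len(frames)
    # Random initial low-dimensional coordinates.
    coords = [[rng.random() for _ in range(dim)] for _ in range(n)]
    for c in range(cycles):
        # Learning rate decreases linearly from lam_start to lam_end.
        lam = lam_start - (lam_start - lam_end) * c / max(cycles - 1, 1)
        for _ in range(steps):
            p = rng.randrange(n)  # pivot index; the pivot itself stays fixed
            for j in range(n):
                if j == p:
                    continue
                # Target distance, computed on the fly (nothing is cached).
                r = dihedral_distance(frames[p], frames[j])
                # Current Euclidean distance in the embedding.
                d = math.sqrt(sum((coords[j][k] - coords[p][k]) ** 2
                                  for k in range(dim)))
                # Enforce local distances; push apart non-neighbors
                # only when they are embedded too close together.
                if r <= cutoff or d < r:
                    scale = lam * 0.5 * (r - d) / (d + eps)
                    for k in range(dim):
                        coords[j][k] += scale * (coords[j][k] - coords[p][k])
    return coords
```

In this sketch each frame would be the list of backbone dihedral angles of one MD snapshot, and the returned coordinates give one low-dimensional point per snapshot. Memory use grows linearly with the number of frames, which is the storage property the pivot-based formulation is meant to illustrate here.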