Analyses of structural dynamics of biomolecules hold great promise to deepen the understanding of and ability to construct complex molecular systems. To this end, both experimental and computational means are available, such as fluorescence quenching experiments or molecular dynamics simulations, respectively. We argue that while seemingly disparate, both fields of study have to deal with the same type of data about the same underlying phenomenon of conformational switching. Two central challenges typically arise in both contexts: (i) the amount of obtained data is large, and (ii) it is often unknown how many distinct molecular states underlie these data. In this study, we build on the established idea of Markov state modeling and propose a generative, Bayesian nonparametric hidden Markov state model that addresses these challenges. Utilizing hierarchical Dirichlet processes, we treat different meta-stable molecule conformations as distinct Markov states, the number of which we then do not have to seta priori. In contrast to existing approaches to both experimental as well as simulation data that are based on the same idea, we leverage a mean-field variational inference approach, enabling scalable inference on large amounts of data. Furthermore, we specify the model also for the important case of angular data, which however proves to be computationally intractable. Addressing this issue, we propose a computationally tractable approximation to the angular model. We demonstrate the method on synthetic ground truth data and apply it to known benchmark problems as well as electrophysiological experimental data from a conformation-switching ion channel to highlight its practical utility.
Keywords: Bayesian nonparametrics; conformational switching; molecular dynamics; variational inference.
Creative Commons Attribution license.