Quantifying the conformational ensembles of biomolecules is fundamental to describing the mechanisms of processes such as ligand binding and allosteric regulation. Accurate quantification of these ensembles remains a challenge for all but the simplest molecules. To overcome the ensemble sampling challenge, enhanced sampling approaches, such as metadynamics, have become widespread in molecular simulation; however, the non-uniform frame weights that result from many of these approaches present an additional challenge to ensemble quantification techniques such as Markov State Modeling or structural clustering. Here, we present a rigorous inclusion of non-uniform frame weights into a structural clustering method called shapeGMM. The shapeGMM method fits a Gaussian mixture model (GMM) to particle positions, and here we advance that approach by incorporating non-uniform frame weights in the estimates of all parameters of the model. The resulting models are high-dimensional probability densities for the unbiased systems, from which we can compute important thermodynamic properties such as relative free energies and configurational entropy. The accuracy of this approach is demonstrated by the quantitative agreement between GMMs computed by Hamiltonian reweighting and by direct simulation of a coarse-grained helix model system. Furthermore, the relative free energy computed from a high-dimensional probability density of alanine dipeptide, reweighted from a metadynamics simulation, quantitatively reproduces the metadynamics free energy in the basins. Finally, the method identifies hidden structures along the globular to filamentous-like structural transition of actin from a metadynamics simulation on a linear discriminant analysis coordinate trained on GMM states, demonstrating the broad applicability of combining our prior and new methods and illustrating how structural clustering of biased data can lead to biophysical insight.
Combined, these results demonstrate that frame-weighted shapeGMM is a powerful approach to quantify biomolecular ensembles from biased simulations.
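To make the idea of frame-weighted parameter estimates concrete, the following is a minimal sketch in Python with NumPy. It is not the shapeGMM implementation: it omits structural alignment and any specific covariance model, and the function names (`weighted_gaussian_params`, `relative_free_energy`), the `responsibilities` argument, and the use of reduced units (`kT=1.0`) are assumptions made for illustration. It shows how non-uniform frame weights can enter the population, mean, and covariance of one Gaussian component, and how a relative free energy follows from two component populations.

```python
import numpy as np


def weighted_gaussian_params(x, frame_weights, responsibilities):
    """Frame-weighted estimates for a single Gaussian component.

    x                : (n_frames, n_features) array of coordinates
    frame_weights    : (n_frames,) non-negative weights, e.g. from reweighting
    responsibilities : (n_frames,) posterior probability that each frame
                       belongs to this component (hypothetical input here)
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(frame_weights, dtype=float)
    w = w / w.sum()                      # normalize frame weights to sum to 1
    g = w * np.asarray(responsibilities, dtype=float)  # combined per-frame weight

    population = g.sum()                 # component weight in the mixture
    mean = (g[:, None] * x).sum(axis=0) / population
    d = x - mean                         # deviations from the weighted mean
    cov = (g[:, None] * d).T @ d / population
    return population, mean, cov


def relative_free_energy(pop_a, pop_b, kT=1.0):
    """Free energy of state a relative to state b from their populations."""
    return -kT * np.log(pop_a / pop_b)
```

With uniform frame weights and unit responsibilities, these expressions reduce to the ordinary sample mean and (biased) sample covariance, which is a convenient sanity check before applying weights from a biased simulation.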
Quantifying Unbiased Conformational Ensembles from Biased Simulations Using ShapeGMM
Subarna Sasmal, Triasha Pal, Glen M. Hocky*, and Martin McCullagh*
J. Chem. Theory Comput., 20 (9), 3492–3502 (2024)
Published