Abstract
Structural dynamics play critical roles in the biological activity of protein molecules. Characterizing the inherent conformational landscapes of these macromolecules remains a major experimental and computational challenge, particularly for heterogeneous and transient systems such as intrinsically disordered proteins, membrane-associated assemblies and the disordered fuzzy coats of amyloid aggregates. In this context, coarse-grained (CG) molecular dynamics simulations have enabled access to extended time scales and large system sizes; however, their reduced resolution and simplified interaction potentials often limit their structural accuracy. Here, we introduce Martini3-NMR, an integrative framework that incorporates nuclear magnetic resonance (NMR) observables directly into CG protein force fields. Using artificial neural networks to model NMR chemical shifts at the CG level, and integrating these data with nuclear Overhauser effect (NOE) restraints, we define an approach that significantly enhances the accuracy of CG simulations while maintaining their high sampling efficiency, thereby yielding a substantially improved description of protein conformational ensembles. We demonstrate the broad applicability of Martini3-NMR by generating CG ensembles for a range of systems involved in diverse biological processes, including protein folding, oligomer disassembly within lipid bilayers and conformational transitions of the disordered fuzzy regions decorating amyloid fibril surfaces, which were found to display condensate-like properties. By enabling an experimentally driven and computationally efficient exploration of protein conformational landscapes, Martini3-NMR provides a novel, general framework for investigating dynamic, heterogeneous and multiscale biomolecular processes. This approach opens significant new opportunities for extending CG simulations toward a more quantitative understanding of the relationship between molecular structure, dynamics and biological function.