Abstract
Understanding the dynamic conformations of proteins is important for rational drug discovery. While molecular dynamics (MD) simulation is the primary tool for this purpose, it is both resource- and time-consuming. Recent advances in deep learning offer an attractive alternative by generating conformational ensembles directly from protein sequences. However, the applicability of such models to protein dynamics studies remains underexplored. Here, we evaluated a representative model, BioEmu, across several tasks related to protein dynamics. Our results show that BioEmu not only generates multiple conformations but also effectively reproduces fundamental properties including residue flexibility, motion correlations, and local residue contacts. However, it fails to predict a mutation-induced shift in conformational distribution and, in some cases, favors higher-energy conformations over lower-energy ones, indicating that it does not reproduce a correct Boltzmann-weighted ensemble. Furthermore, BioEmu-generated conformations provide only limited improvement in ensemble docking. These findings delineate the current capabilities and limitations of sequence-based generative models for conformational sampling. They also highlight directions for future development: further energy-based fine-tuning is needed for tasks that depend on conformational distributions, and an atom-level generative model is required to study intermolecular interactions.