Abstract
Conformational changes underlie many aspects of protein function, yet current structure prediction tools remain limited in their ability to systematically sample structural ensembles. Here, we present ConforPSSP and ConforFold, a combined framework that integrates secondary-structure sampling into deep learning-based structure prediction to recover multiple protein conformational states. ConforPSSP employs a transformer model trained on multi-residue fragments to generate diverse 8-state protein secondary structure predictions (PSSPs), which are then used to condition a retrained OpenFold model (ConforFold). ConforFold achieved state-of-the-art performance in conformer recovery. On our test dataset of proteins with two alternative conformations, it correctly identified both conformers in 84% of cases at a TM-score threshold of ≥0.8, outperforming AlphaFlow (75.4%), which uses diffusion-based sampling, and Cfold, which relies on MSA clustering. It also outperformed BioEmu, a novel method that emulates molecular dynamics (MD) simulation results, in cases where the secondary structures of the two conformers differ significantly (83% and 76% of cases for ConforFold and BioEmu, respectively). These results establish ConforFold as a broadly applicable framework for modeling structural ensembles. By explicitly integrating secondary structure, it recovers conformations inaccessible to MSA-based subsampling or diffusion models, offering a new avenue for investigating conformational heterogeneity, mechanistic transitions, and the structural basis of protein function.
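
As an illustration of the recovery criterion quoted above, the following minimal sketch shows one way such a metric could be computed. It assumes (this is not stated in the abstract) that each protein has a set of sampled structures, that each sample has already been scored by TM-score against both reference conformers, and that a protein counts as recovered when the best score against each conformer reaches 0.8. The function names and the toy scores are hypothetical and are not taken from the ConforFold code or data.

```python
from typing import Dict, Sequence, Tuple

# Threshold quoted in the abstract; everything else here is an assumption.
TM_THRESHOLD = 0.8

# One entry per sampled structure:
# (TM-score vs. reference conformer A, TM-score vs. reference conformer B)
ScorePair = Tuple[float, float]


def both_conformers_recovered(
    scores: Sequence[ScorePair], threshold: float = TM_THRESHOLD
) -> bool:
    """Return True if some sample matches conformer A and some sample
    matches conformer B at or above the threshold (assumed criterion)."""
    if not scores:
        return False
    best_a = max(pair[0] for pair in scores)
    best_b = max(pair[1] for pair in scores)
    return best_a >= threshold and best_b >= threshold


def recovery_rate(
    dataset: Dict[str, Sequence[ScorePair]], threshold: float = TM_THRESHOLD
) -> float:
    """Fraction of proteins for which both conformers are recovered."""
    if not dataset:
        return 0.0
    hits = sum(
        both_conformers_recovered(scores, threshold)
        for scores in dataset.values()
    )
    return hits / len(dataset)


# Hypothetical toy scores, for illustration only (not results from the paper).
toy_dataset = {
    "protein_1": [(0.92, 0.41), (0.45, 0.85)],  # both conformers sampled
    "protein_2": [(0.90, 0.60), (0.88, 0.70)],  # only conformer A sampled
}
toy_rate = recovery_rate(toy_dataset)
```

With these toy scores, `protein_1` counts as recovered and `protein_2` does not, so `toy_rate` is 0.5. A stricter variant could additionally require that the two best-matching samples be distinct structures; whether the reported evaluation does so is not specified here.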