Abstract
Co-folding models such as AlphaFold have revolutionized protein complex structure prediction, yet their reliance on multiple sequence alignments (MSAs) limits their applicability to challenging targets such as antibody-antigen complexes. An alternative approach, structure-based protein-protein docking, predicts the bound complex structure from the unbound monomer structures without requiring MSAs. In this work, we propose a novel method to adapt co-folding models for structure-based protein-protein docking by replacing their template module with a docking module and then training end-to-end with a flow-matching objective. We apply our method to AlphaFold-Multimer (AF-M) using the OpenFold implementation and transform it into a generative docking model, which we name AF2Dock. We evaluate AF2Dock and various baseline methods on the PINDER-AF2 benchmark and an antibody/nanobody test set. When using non-holo inputs, AF2Dock performs competitively with other structure-based docking methods and, in the case of nanobody complexes, outperforms all other docking methods tested here. Although AF2Dock underperforms co-folding AF-M and AF3 in success rates when using non-holo inputs, it produces orthogonal predictions and successfully identifies correct structures for targets where co-folding models fail. Ablation studies confirm that full-parameter fine-tuning of the AF-M components is critical for performance and reveal that, surprisingly, the inclusion of ESM embeddings can lower success rates in certain cases such as nanobody complexes. The code is available at https://github.com/Graylab/AF2Dock.