Abstract
Accurately determining the binding affinity of a ligand to a protein is important for drug design, development, and screening. With the advent of accessible protein structure prediction methods such as AlphaFold, predicted protein 3D structures are readily available; however, scalable methods for predicting binding affinity do not currently take full advantage of 3D protein information. Here, we present CASTER-DTA (Cross-Attention with Structural Target Equivariant Representations for Drug-Target Affinity), which predicts drug-target affinity (DTA) by pairing an equivariant graph neural network (GNN), which learns more robust protein representations, with a standard GNN that learns molecular representations. We augment these representations with an attention-based mechanism between protein residues and drug atoms to improve interpretability. We show that CASTER-DTA improves on the state of the art on multiple DTA prediction benchmarks and that it generates novel insights for several related tasks. We then apply CASTER-DTA to create a large resource of predicted binding affinities of every drug approved by the U.S. Food and Drug Administration (FDA) against every protein in the human proteome, and we make these predictions freely available for download. We also provide a web server that lets researchers apply a pretrained CASTER-DTA model to predict binding affinities between arbitrary proteins and drugs.