Abstract
Predicting drug-target interactions (DTI) with graph neural networks (GNNs) is hindered by their lack of interpretability. To address this, we benchmark four explainable artificial intelligence (XAI) attribution methods on GNN models trained for kinase and G-protein-coupled receptor (GPCR) targets. We assess the methods' consistency through atom-level intersection over union (IoU) and validate their biological relevance by mapping attributed atoms onto three-dimensional (3D) protein-ligand structures. While consistency across methods was modest, consensus attributions were highly enriched for atoms directly contacting the binding pocket: up to 76% lay within 2 Å in kinase-inhibitor complexes. Notably, these attributed atoms were frequently found contacting experimentally important regulatory residues, such as those in the DFG motif. This indicates that XAI methods, despite their disagreements, can identify chemically meaningful ligand features, providing a foundation for developing more interpretable GNNs in drug discovery.