Abstract
Drug-target interaction (DTI) prediction is a critical task in drug discovery. However, the large difference in size between drugs and proteins makes it difficult to predict binding sites accurately. In addition, modality imbalance, which arises from modality learning biases, undermines the contribution of multimodal representations to DTI prediction. To address these challenges, we propose DDGR-DTI, which is based on an innovative Decoupled Dual-Granularity Framework and a Rebalanced Pyramid Network (RPN). The framework divides the DTI task into two levels of granularity: the macro level decomposes the task into subtasks based on modality, and the micro level further decomposes the representation within each subtask. A dual-stream attention module performs fine-grained substructure-level interactions within each subtask, thereby enabling accurate identification of binding sites. The RPN alleviates the bias towards the dominant modality in multimodal fusion through a hierarchical aggregation mechanism, emphasizing the synergistic advantages brought by modality balance. Benchmark results demonstrate that DDGR-DTI outperforms existing state-of-the-art models in both prediction performance and generalization ability.
Availability: The source code and dataset can be found at https://github.com/ZZUzy/DDGR-DTI.
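To make the idea of substructure-level interaction concrete, the following is a minimal sketch of bidirectional cross-attention between drug substructure embeddings and protein substructure embeddings. It is an illustration under stated assumptions, not the DDGR-DTI implementation: the function names (`cross_attention`, `dual_stream_attention`), the use of plain scaled dot-product attention without learned projections, and the toy two-dimensional embeddings are all hypothetical choices made here for clarity.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention: each query attends over all key/value vectors.

    Assumption: keys and values are the same vectors and no learned
    projections are applied, which keeps the sketch dependency-free.
    """
    dim = len(queries[0])
    attended, weights = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        w = softmax(scores)
        weights.append(w)
        attended.append(
            [sum(wj * kv[d] for wj, kv in zip(w, keys_values)) for d in range(dim)]
        )
    return attended, weights


def dual_stream_attention(drug_subs, protein_subs):
    """Hypothetical two-stream interaction: drug-to-protein and protein-to-drug."""
    drug_ctx, drug_to_prot = cross_attention(drug_subs, protein_subs)
    prot_ctx, prot_to_drug = cross_attention(protein_subs, drug_subs)
    return drug_ctx, prot_ctx, drug_to_prot, prot_to_drug


# Toy embeddings (illustrative values only): 2 drug and 3 protein substructures.
drug = [[1.0, 0.0], [0.0, 1.0]]
protein = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
drug_ctx, prot_ctx, d2p, p2d = dual_stream_attention(drug, protein)
```

In this sketch, each row of `d2p` is a probability distribution over protein substructures for one drug substructure, so large entries can be read as candidate interaction sites. A real model would add learned query, key, and value projections and multiple heads.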