Abstract
Drug-target interaction (DTI) prediction is essential for drug discovery and repurposing. Current DTI prediction methods rely on single-source encoding and fuse multimodal information inadequately. To overcome these limitations, this study proposes a DTI prediction method based on multimodal information fusion (MIF-DTI) and further designs an ensemble version (MIF-DTI-B). In MIF-DTI, a sequence encoding module encodes the SMILES sequences of drugs and the amino acid sequences of targets to extract their 1D sequence features. A graph encoding module performs dual-view representation encoding on the hierarchical molecular graphs of drugs and the contact graphs of targets to capture their 2D topological structure information. A decoding module then fuses the information from the different modalities. MIF-DTI-B ensembles several MIF-DTI models through a cross-validation strategy to improve predictive accuracy. The proposed models are evaluated on three publicly available DTI datasets. Experimental results demonstrate that fully integrating multimodal information enables both MIF-DTI and MIF-DTI-B to consistently outperform state-of-the-art methods.
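The cross-validation ensembling behind MIF-DTI-B can be illustrated with a minimal sketch. The abstract does not state how the fold models are combined, so averaging their predicted probabilities is an assumption here. The threshold model is only a hypothetical stand-in for a trained MIF-DTI network, and all names (`k_fold_indices`, `train_threshold_model`, `train_ensemble`, `ensemble_predict`, `DATA`) are illustrative rather than taken from the paper.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, int]        # (toy scalar feature, binary interaction label)
Model = Callable[[float], float]  # maps a feature to an interaction probability


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split indices 0..n-1 into k contiguous, nearly equal held-out folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def train_threshold_model(train: Sequence[Sample]) -> Model:
    """Hypothetical stand-in for one MIF-DTI model: a midpoint threshold rule."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    threshold = (mean(pos) + mean(neg)) / 2
    return lambda x: 1.0 if x > threshold else 0.0


def train_ensemble(data: Sequence[Sample], k: int) -> List[Model]:
    """Train one model per fold, each on the samples outside that fold."""
    models = []
    for fold in k_fold_indices(len(data), k):
        held_out = set(fold)
        train = [s for i, s in enumerate(data) if i not in held_out]
        models.append(train_threshold_model(train))
    return models


def ensemble_predict(models: Sequence[Model], x: float) -> float:
    """Assumed combination rule: average the per-model probabilities."""
    return mean(m(x) for m in models)


# Toy data: low feature values are non-interacting, high values interacting.
DATA: List[Sample] = [(0.1, 0), (0.9, 1), (0.2, 0), (0.8, 1), (0.3, 0), (0.7, 1)]
MODELS = train_ensemble(DATA, k=3)
```

Calling `ensemble_predict(MODELS, x)` then returns the mean score of the three fold models for a new sample. In a real setting, each fold model would be a full multimodal network, and the held-out fold would typically serve for validation or model selection.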