Abstract
Accurate identification of nucleic acid-binding residues is crucial for understanding protein-nucleic acid interactions, which play a key role in gene expression research and the discovery of regulatory mechanisms. Despite numerous computational efforts, achieving high accuracy remains difficult because discriminative information is hard to extract from proteins. Here, we introduce MegSite, a novel method informed by a multimodal protein language model that integrates discriminative knowledge from protein sequence, structure, and function. This work presents the first integration of ESM3 multimodal features for nucleic acid-binding site prediction. MegSite significantly outperforms existing prediction methods on multiple independent test sets. On DNA-129_Test, DNA-181_Test, RNA-117_Test, and RNA-285_Test, MegSite achieves Matthews correlation coefficient values of 0.567, 0.444, 0.411, and 0.421, improvements of 2.72%, 7.66%, 1.22%, and 6.58%, respectively, over the second-best method. Notably, MegSite performs robustly even on proteins with low structural similarity, surpassing previous structure-based methods. Furthermore, the method extends readily to predicted protein structures and to a newly released RNA-binding residue test set while retaining high accuracy, highlighting its broad applicability. Comprehensive experiments indicate that the superior performance of MegSite is attributable to its effective integration of multimodal protein knowledge.
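
For readers unfamiliar with the headline metric, the following is a minimal sketch of how a Matthews correlation coefficient can be computed from per-residue binary predictions (binding vs. non-binding), and of how a percentage gain over a baseline might be derived. It is not the MegSite evaluation code: the function names are hypothetical, and reading the reported improvements as relative (rather than absolute) gains is an assumption.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Standard MCC from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention
    (an assumption here, not taken from the paper).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def relative_improvement(new, baseline):
    """Percentage gain of `new` over `baseline` (assumed relative, not absolute)."""
    return 100.0 * (new - baseline) / baseline
```

MCC ranges from -1 (every residue mislabeled) through 0 (no better than chance) to 1 (perfect agreement). It is often preferred for binding-site prediction because binding residues are typically a small minority of all residues, which can make plain accuracy misleading.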