CMMS-GCL: cross-modality metabolic stability prediction with graph contrastive learning

CMMS-GCL:基于图对比学习的跨模态代谢稳定性预测

阅读:2

Abstract

MOTIVATION: Metabolic stability plays a crucial role in the early stages of drug discovery and development. Accurately modeling and predicting molecular metabolic stability has great potential for the efficient screening of drug candidates as well as the optimization of lead compounds. Considering wet-lab experiment is time-consuming, laborious, and expensive, in silico prediction of metabolic stability is an alternative choice. However, few computational methods have been developed to address this task. In addition, it remains a significant challenge to explain key functional groups determining metabolic stability. RESULTS: To address these issues, we develop a novel cross-modality graph contrastive learning model named CMMS-GCL for predicting the metabolic stability of drug candidates. In our framework, we design deep learning methods to extract features for molecules from two modality data, i.e. SMILES sequence and molecule graph. In particular, for the sequence data, we design a multihead attention BiGRU-based encoder to preserve the context of symbols to learn sequence representations of molecules. For the graph data, we propose a graph contrastive learning-based encoder to learn structure representations by effectively capturing the consistencies between local and global structures. We further exploit fully connected neural networks to combine the sequence and structure representations for model training. Extensive experimental results on two datasets demonstrate that our CMMS-GCL consistently outperforms seven state-of-the-art methods. Furthermore, a collection of case studies on sequence data and statistical analyses of the graph structure module strengthens the validation of the interpretability of crucial functional groups recognized by CMMS-GCL. Overall, CMMS-GCL can serve as an effective and interpretable tool for predicting metabolic stability, identifying critical functional groups, and thus facilitating the drug discovery process and lead compound optimization. AVAILABILITY AND IMPLEMENTATION: The code and data underlying this article are freely available at https://github.com/dubingxue/CMMS-GCL.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。