Enhancing Unconditional Molecule Generation via Online Knowledge Distillation of Scaffolds


Abstract

Generating new drug-like molecules is an essential aspect of drug discovery, and deep learning models significantly accelerate this process. Language models have demonstrated great potential in generating novel and realistic SMILES representations of molecules. Molecular scaffolds, which serve as the key structural foundation, can help language models discover chemically feasible and biologically relevant molecules. However, directly using scaffolds as prior inputs can introduce bias, thereby limiting the exploration of novel molecules. To combine the above advantages while addressing this limitation, we incorporate molecular scaffold information into language models via an Online knowledge distillation framework for the unconditional Molecule Generation task (OMG), which consists of a GPT model that generates SMILES strings of molecules from scratch and a Transformer model that generates SMILES strings of molecules from scaffolds. The knowledge of scaffolds and complete molecular structures is deeply integrated through the mutual learning of the two models. Experimental results on two well-known molecule generation benchmarks show that the OMG framework enhances both the validity and novelty of the GPT-based unconditional molecule generation model. Furthermore, comprehensive property-specific evaluation results indicate that the generated molecules achieve a favorable balance across multiple chemical properties and biological activity, demonstrating the potential of our method in discovering viable drug candidates.
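The abstract does not spell out the distillation objective, but online knowledge distillation between two peer models is commonly implemented as a deep-mutual-learning loss: each model minimizes its own task cross-entropy plus a KL-divergence term pulling its next-token distribution toward its peer's. The sketch below illustrates that general loss shape on toy next-token distributions over SMILES tokens; the function names, vocabulary, probabilities, and the weighting factor `alpha` are all hypothetical and not taken from the paper.

```python
import math

# Toy vocabulary of SMILES tokens (illustrative only, not the paper's tokenizer).
VOCAB = ["C", "c", "O", "N", "(", ")", "=", "1"]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions over the same vocabulary."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def cross_entropy(target, pred, eps=1e-12):
    """Cross-entropy of predicted distribution `pred` against target `target`."""
    return -sum(ti * math.log(qi + eps) for ti, qi in zip(target, pred))

def mutual_distillation_losses(p_gpt, p_scaffold, target, alpha=0.5):
    """Hypothetical mutual-learning loss: task loss + alpha * KL toward the peer.

    p_gpt      -- next-token distribution from the unconditional GPT model
    p_scaffold -- next-token distribution from the scaffold-conditioned Transformer
    target     -- one-hot ground-truth distribution for the next token
    """
    loss_gpt = cross_entropy(target, p_gpt) + alpha * kl_divergence(p_scaffold, p_gpt)
    loss_scaffold = cross_entropy(target, p_scaffold) + alpha * kl_divergence(p_gpt, p_scaffold)
    return loss_gpt, loss_scaffold

# Example: both models predict the token after "C"; the ground truth is "c".
target = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
p_gpt = [0.2, 0.5, 0.1, 0.1, 0.025, 0.025, 0.025, 0.025]
p_scaffold = [0.1, 0.7, 0.05, 0.05, 0.025, 0.025, 0.025, 0.025]
loss_gpt, loss_scaffold = mutual_distillation_losses(p_gpt, p_scaffold, target)
```

Because each model also serves as a soft target for the other, scaffold knowledge can flow into the unconditional GPT during training without scaffolds ever being fed to it at generation time, which is consistent with the bias concern the abstract raises.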
