Machine learning analysis of molecular dynamics properties influencing drug solubility

Abstract

Solubility is critical in drug discovery and development because it strongly influences a medication's bioavailability and therapeutic efficacy. Assessing solubility early in drug discovery helps minimize resource consumption and improves the likelihood of clinical success by prioritizing compounds with optimal solubility. Molecular dynamics (MD) simulation is a powerful computational tool for modeling physicochemical properties, including solubility. MD simulations offer a detailed view of molecular interactions and dynamics, providing insight into the factors that influence solubility. This study statistically examines the impact of ten MD-derived properties, together with the octanol-water partition coefficient (logP), one of the most influential experimental properties, on the aqueous solubility of drugs using machine learning (ML) techniques. To this end, a dataset of 211 drugs from diverse classes was compiled from the literature. These drugs were subjected to MD simulation, and the relevant properties were extracted as candidate features. The corresponding logP values from previous studies were also included in the analysis. Through this analysis, the properties with the greatest influence on solubility were identified and then used as input features for four ensemble ML algorithms: Random Forest, Extra Trees, XGBoost, and Gradient Boosting. The results indicate that seven properties are highly effective in predicting solubility: logP, Solvent Accessible Surface Area (SASA), Coulombic_t, LJ, Estimated Solvation Free Energy (DGSolv), Root Mean Square Deviation (RMSD), and the Average number of solvents in the Solvation Shell (AvgShell). Models built on these features perform comparably to predictive models based on structural features. The Gradient Boosting algorithm achieved the best performance, with a predictive R² of 0.87 and an RMSE of 0.537 on the test set. This research underscores the potential of integrating MD simulations with ML methodologies to improve the accuracy and efficiency of aqueous solubility predictions in drug development.
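
The modeling step described in the abstract (a small feature table fed to several ensemble regressors, each scored by test-set R² and RMSE) might look roughly like the sketch below. It is not the authors' code. The feature names come from the abstract, but the data are randomly generated placeholders, the target relationship is invented, and the 80/20 split and default hyperparameters are assumptions. XGBoost is left out only to keep the example to a single dependency (scikit-learn), so the scores it prints say nothing about real solubility.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Feature names taken from the abstract; all values below are synthetic.
FEATURES = ["logP", "SASA", "Coulombic_t", "LJ", "DGSolv", "RMSD", "AvgShell"]


def make_synthetic_data(n_samples=211, seed=0):
    """Generate placeholder features and a made-up solubility-like target."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, len(FEATURES)))
    # Arbitrary relationship, chosen only so the models have signal to learn.
    y = -1.0 * X[:, 0] + 0.5 * X[:, 4] + 0.3 * X[:, 1] * X[:, 6]
    y = y + rng.normal(scale=0.3, size=n_samples)
    return X, y


def compare_models(X, y, seed=0):
    """Fit each ensemble regressor and return its test-set R^2 and RMSE."""
    # Assumed 80/20 split; the abstract does not state the split used.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    models = {
        "Random Forest": RandomForestRegressor(random_state=seed),
        "Extra Trees": ExtraTreesRegressor(random_state=seed),
        "Gradient Boosting": GradientBoostingRegressor(random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
        results[name] = {"r2": float(r2_score(y_test, pred)), "rmse": rmse}
    return results


if __name__ == "__main__":
    X, y = make_synthetic_data()
    for name, scores in compare_models(X, y).items():
        print(f"{name}: R2={scores['r2']:.2f}, RMSE={scores['rmse']:.3f}")
```

With real data, `make_synthetic_data` would be replaced by loading the MD-derived descriptors and measured solubilities. With only a couple of hundred compounds, cross-validation would likely give a steadier estimate than a single split.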
