This study presents a novel approach to ¹H NMR-based machine learning (ML) models for predicting logD using computer-generated ¹H NMR spectra. Building on our previous work, which integrated experimental ¹H NMR data, this study addresses key limitations associated with experimental measurements, such as sample stability, solvent variability, and extensive processing, by replacing them with fully computational workflows. Benchmarking across various density functional theory (DFT) functionals and basis sets highlighted their limitations: DFT-based models showed relatively high RMSE values for CHI logD (average 1.12, lowest 0.96) and heavy computational demands, limiting their usefulness for large-scale predictions. In contrast, models trained on ¹H NMR spectra predicted by NMRshiftDB2 and JEOL JASON achieved RMSE values as low as 0.76, compared to 0.88 for experimental spectra. Further analysis revealed that mixing experimental and predicted spectra did not enhance accuracy, underscoring the advantage of homogeneous datasets. Validation with external datasets confirmed the robustness of our models, showing performance comparable to commercial software such as Instant JChem and supporting the reliability of the proposed computational workflow. Additionally, using normalized RMSE (NRMSE) proved essential for consistent model evaluation across datasets with varying data scales. By eliminating the need for experimental input, this workflow offers a widely accessible, computationally efficient pipeline, setting a new standard for ML-driven chemical property predictions without experimental data constraints.
From NMR to AI: Do We Need ¹H NMR Experimental Spectra to Obtain High-Quality logD Prediction Models?
Authors: Leniak Arkadiusz, Pietruś Wojciech, Świderska Aleksandra, Kurczab Rafał
| Journal: | Journal of Chemical Information and Modeling | Impact factor: | 5.300 |
| Year: | 2025 | Volume(issue):pages: | 2025 Mar 24; 65(6):2924-2939 |
| DOI: | 10.1021/acs.jcim.4c02145 | | |
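
The abstract's remark on normalized RMSE (NRMSE) can be illustrated with a minimal sketch. The abstract does not state which normalization the authors used; the version below assumes division by the range of the observed values, which is one common convention. The function names and both toy datasets are hypothetical and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse(y_true, y_pred):
    """RMSE divided by the range of the observed values.

    Assumption: range normalization. Other conventions divide by the
    mean or the standard deviation of the observed values instead.
    """
    span = max(y_true) - min(y_true)
    if span == 0:
        raise ValueError("observed values must not all be identical")
    return rmse(y_true, y_pred) / span


# Hypothetical data: both datasets carry the same absolute error (0.5),
# but the observed values span very different scales.
narrow_true = [0.0, 1.0, 2.0, 3.0, 4.0]
narrow_pred = [v + 0.5 for v in narrow_true]
wide_true = [0.0, 25.0, 50.0, 75.0, 100.0]
wide_pred = [v + 0.5 for v in wide_true]

print(rmse(narrow_true, narrow_pred), nrmse(narrow_true, narrow_pred))
print(rmse(wide_true, wide_pred), nrmse(wide_true, wide_pred))
```

Both toy datasets give the same RMSE, while the NRMSE differs by a factor of 25. This is why a scale-aware metric helps when comparing models across datasets whose target values cover different ranges.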
