IMPRESSION generation 2 - accurate, fast and generalised neural network model for predicting NMR parameters in place of DFT

Abstract

Predicting 3D-aware Nuclear Magnetic Resonance (NMR) properties is critical for determining the 3D structure and dynamics, both stereochemical and conformational, of molecules in solution. Existing tools for such predictions are limited: they are either relatively slow quantum chemical methods such as Density Functional Theory (DFT), or niche parameterised empirical or machine learning methods that predict only a single parameter type, often across only a limited chemical space. We present here IMPRESSION-Generation 2 (G2), a transformer-based neural network that can be used in computational workflows as a much faster alternative to high-level DFT calculations, predicting multiple classes of NMR parameter simultaneously with time savings of several orders of magnitude. IMPRESSION-G2 is the first system that simultaneously predicts all NMR chemical shifts, as well as scalar couplings for ¹H, ¹³C, ¹⁵N and ¹⁹F nuclei up to 4 bonds apart, in a single prediction event starting from a 3D molecular structure. A prediction takes <50 ms to produce, on average, ∼5000 chemical shifts and scalar couplings per molecule, which is approximately 10⁶ times faster than DFT-based NMR predictions starting from a 3D structure. When combined with fast GFN2-xTB geometry optimisations, which generate the 3D input structures themselves in just a few seconds, a complete workflow for NMR predictions on a new molecule is 10³-10⁴ times faster than a wholly DFT-based workflow.
The accuracy of this multi-parameter predictor in reproducing DFT-quality results for a wide chemical space of organic molecules up to ∼1000 g mol⁻¹ containing C, H, N, O, F, Si, P, S, Cl and Br exceeds that of existing state-of-the-art empirical or machine learning systems (∼0.07 ppm for ¹H chemical shifts, ∼0.8 ppm for ¹³C chemical shifts, <0.15 Hz for ³J(HH) scalar coupling constants). Critically, it also demonstrates generalisability when tested against molecules from sources that are completely independent of its own training data. When compared to experimental NMR data for ∼5000 compounds, IMPRESSION-G2 gives results in minutes on a standard laptop that are almost indistinguishable from DFT results that took days on a large-scale High Performance Computing system. The accuracy and speed of IMPRESSION-G2 coupled to GFN-xTB show that it can simply replace DFT for predicting 3D-aware NMR parameters inside the wide chemical space of its training data.
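
The timing figures quoted above can be related to each other with some simple arithmetic. The following Python sketch is only a back-of-envelope illustration, not part of the published work. It treats the "<50 ms" figure as an upper bound and assumes 5 s as a stand-in for "a few seconds" of GFN2-xTB geometry optimisation. It also counts only the DFT NMR step on the DFT side, because the abstract does not state how long a DFT geometry optimisation takes. The function and constant names are hypothetical.

```python
# Back-of-envelope arithmetic on the timings quoted in the abstract.
# Figures marked "assumed" are illustrative and not taken from the source.

ML_PREDICTION_S = 0.050      # "<50 ms" per molecule, used here as an upper bound
PREDICTION_SPEEDUP = 1e6     # "approximately 10^6 times faster" than DFT
XTB_GEOMETRY_S = 5.0         # assumed stand-in for "a few seconds"


def implied_dft_nmr_seconds(ml_seconds: float, speedup: float) -> float:
    """DFT NMR time implied by the ML time and the quoted speed-up."""
    return ml_seconds * speedup


def workflow_speedup(dft_seconds: float, xtb_seconds: float, ml_seconds: float) -> float:
    """Ratio of a DFT workflow time to the xTB + ML workflow time."""
    return dft_seconds / (xtb_seconds + ml_seconds)


dft_nmr_s = implied_dft_nmr_seconds(ML_PREDICTION_S, PREDICTION_SPEEDUP)
ratio = workflow_speedup(dft_nmr_s, XTB_GEOMETRY_S, ML_PREDICTION_S)

print(f"Implied DFT NMR step: {dft_nmr_s:.0f} s (about {dft_nmr_s / 3600:.1f} h)")
print(f"Workflow speed-up with a {XTB_GEOMETRY_S:.0f} s xTB step: about {ratio:.0f}x")
```

Under these assumptions the implied DFT NMR step is on the order of 14 hours per molecule, and the geometry step, rather than the neural network, dominates the fast workflow. That is one plausible reading of why the quoted end-to-end speed-up (10³-10⁴) is smaller than the prediction-only speed-up (10⁶).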
