Data Checking of Asymmetric Catalysis Literature Using a Graph Neural Network Approach

利用图神经网络方法对不对称催化文献进行数据检查

阅读:4

Abstract

The range of chemical databases available has dramatically increased in recent years, but the reliability and quality of their data are often negatively affected by human-error fidelity. The size of chemical databases can make manual data curation/checking of such sets time consuming; thus, automated tools to help this process are highly desirable. Herein, we propose the use of Graph Neural Networks (GNNs) to identifying potential stereochemical misassignments in the primary asymmetric catalysis literature. Our method relies on the use of an ensemble of GNN models to predict the expected stereoselectivity of exemplars for a particular asymmetric reaction. When the majority of these models do not correlate to the reported outcome, the point is labeled as a possible stereochemical misassignment. Such identified cases are few in number and more easily investigated for their cause. We demonstrate the use of this approach to spot potential literature stereochemical misassignments in the ketone products resulting from catalytic asymmetric 1,4-addition of organoboron nucleophiles to Michael acceptors in two different databases, each one using a different family of chiral ligands (bisphosphine and diene ligands). Our results demonstrate that this methodology is useful for curation of medium-sized databases, speeding this process significantly compared to complete manual curation/checking. In the datasets investigated, human expert checking was reduced to 2.2% and 3.5% of the total data exemplars.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。