Automatic feature engineering for catalyst design using small data without prior knowledge of target catalysis

Abstract

The empirical aspect of descriptor design in catalyst informatics, particularly when confronted with limited data, necessitates adequate prior knowledge for delving into unknown territories, thus presenting a logical contradiction. This study introduces a technique for automatic feature engineering (AFE) that works on small catalyst datasets, without reliance on specific assumptions or pre-existing knowledge about the target catalysis when designing descriptors and building machine-learning models. This technique generates numerous features through mathematical operations on general physicochemical features of catalytic components and extracts relevant features for the desired catalysis, essentially screening numerous hypotheses on a machine. AFE yields reasonable regression results for three types of heterogeneous catalysis: oxidative coupling of methane (OCM), conversion of ethanol to butadiene, and three-way catalysis, where only the training set is swapped. Moreover, through the application of active learning that combines AFE and high-throughput experimentation for OCM, we successfully visualize the machine's process of acquiring precise recognition of the catalyst design. Thus, AFE is a versatile technique for data-driven catalysis research and a key step towards fully automated catalyst discoveries.
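
The generate-then-select idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the operator set, the pairwise-product expansion, the ordinary-least-squares model, the leave-one-out scoring, the function names, and the noise-free synthetic data are all assumptions chosen purely for illustration.

```python
import itertools
import numpy as np


def expand_features(X, names):
    """Build candidate features: unary transforms of each primary feature,
    followed by all pairwise products of those transformed columns."""
    # Hypothetical operator set; any library of simple functions would do.
    unary = {
        "id": lambda v: v,
        "sqrt": lambda v: np.sqrt(np.abs(v)),
        "sq": lambda v: v ** 2,
        "log": lambda v: np.log(np.abs(v) + 1e-9),
        "inv": lambda v: 1.0 / (np.abs(v) + 1e-9),
    }
    cols, labels = [], []
    for j, name in enumerate(names):
        for fname, f in unary.items():
            cols.append(f(X[:, j]))
            labels.append(f"{fname}({name})")
    n_first_order = len(cols)
    for a, b in itertools.combinations(range(n_first_order), 2):
        cols.append(cols[a] * cols[b])
        labels.append(f"{labels[a]}*{labels[b]}")
    return np.column_stack(cols), labels


def loo_mae(F, y):
    """Leave-one-out mean absolute error of a linear least-squares fit."""
    n = len(y)
    errors = []
    for i in range(n):
        mask = np.arange(n) != i
        A = np.column_stack([np.ones(n - 1), F[mask]])
        coef, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
        pred = np.concatenate([[1.0], F[i]]) @ coef
        errors.append(abs(pred - y[i]))
    return float(np.mean(errors))


def select_features(F, y, k=1):
    """Greedy forward selection: add the feature that most lowers LOO error."""
    chosen = []
    for _ in range(k):
        candidates = [j for j in range(F.shape[1]) if j not in chosen]
        best = min(candidates, key=lambda j: loo_mae(F[:, chosen + [j]], y))
        chosen.append(best)
    return chosen


# Synthetic "small data": 30 samples, 3 made-up primary features a, b, c.
rng = np.random.default_rng(0)
X = rng.uniform(0.5, 3.0, size=(30, 3))
y = X[:, 0] ** 2 * X[:, 1]  # hidden relationship the search should recover

F, labels = expand_features(X, ["a", "b", "c"])
chosen = select_features(F, y, k=1)
print(F.shape[1], "candidate features; selected:", [labels[j] for j in chosen])
```

Each candidate feature plays the role of one hypothesis about what drives the target, and the cross-validated score decides which hypotheses survive. With real catalyst data the target is noisy, so the scoring model and the number of selected features would need more care than this toy example suggests.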

Journal:	Communications Chemistry	Impact factor:	5.900
Year:	2024	Citation:	2024 Jan 12;7(1):11.
doi:	10.1038/s42004-023-01086-y	Research area:	Signal transduction
