Abstract
In both research and industrial chemical synthesis, identifying suitable general reaction conditions is critical. However, chemical reactions typically involve multiple factors, including catalysts, solvents, temperature, and reaction time, and the conditions that are optimal for one substrate often do not transfer to others. To address this limitation, this study proposes a bidirectional general reaction condition optimization framework that integrates a multiarmed bandit algorithm with a regression model. The framework first uses the multiarmed bandit algorithm to dynamically balance exploration and exploitation in reaction condition selection. A regression model, combined with molecular representation and a per-substrate selective model training strategy, is then used to select substrates, thereby improving the accuracy of general reaction condition optimization. Experimental results demonstrate that the framework is efficient and adapts well across diverse reaction data sets. In two data sets with comparable numbers of substrates and reaction conditions, the framework achieves accuracy improvements of 20% and 15% over state-of-the-art models. Furthermore, the framework maintains robust optimization performance in two specialized data sets, one featuring extensive substrate combinations and the other containing numerous condition combinations, further validating its effectiveness.