Abstract
Traditional QSAR/QSPR and inverse QSAR/QSPR methods often assume that chemical properties are determined by single molecules, overlooking the influence of multibody molecular interactions and environmental factors. In this paper, we introduce a novel inverse QSAR/QSPR framework that captures the combined effects of multiple interacting molecules (e.g., small molecules or polymers) and experimental conditions on property values. To the best of our knowledge, this is the first machine learning-based inverse QSAR/QSPR framework to explicitly integrate multiple interacting molecules and environmental factors. To this end, we design a feature function that integrates information on the interacting molecules and the environment. As a case study, we consider the Flory-Huggins χ-parameter, which characterizes the thermodynamic interaction between a solute and a solvent and varies with temperature. Computational experiments show that our approach achieves learning performance competitive with existing work on predicting χ-parameter values, while inferring solute polymers with up to 50 non-hydrogen atoms in their monomer forms in a relatively short time. A comparison study with the simulation software J-OCTA demonstrates that the polymers inferred by our methods are of high quality. All source code is available at https://github.com/ku-dml/mol-infer/tree/master/chi-parameter.