Abstract
We developed SPARFlow, an open-source KNIME workflow for structure-activity and structure-property relationship (SAR/SPR) analyses. The workflow integrates data preprocessing, chemical structure curation, similarity network construction, maximum common substructure detection, R-group decomposition, activity cliff identification, and database modelability assessment. It implements established indices, including SALI, SARI, MODI*, and RMODI, to characterize SAR landscapes and assess dataset suitability for predictive modeling. SPARFlow was validated using four datasets with distinct chemical and endpoint characteristics: cruzain inhibitors, biased μ-opioid receptor agonists, pesticides, and carbonyl compounds with hydration constants.

Scientific Contribution
This work introduces SPARFlow, a KNIME-integrated workflow that combines data curation, activity-cliff detection, and modelability assessment for SAR and SPR studies. The workflow provides a unified implementation of key SAR analyses within a single KNIME pipeline. It updates implementations of established metrics, including MODI* and RMODI, together with complementary indices such as SARI and SALI, and ensures consistent data flow across all modules.