Abstract
Support vector machines (SVMs) and support vector regression (SVR) are widely used to build quantitative structure-activity relationship models for small- and medium-sized datasets. Although SVM and SVR models can predict compound activity efficiently, evaluating billions of molecules, as is sometimes required when screening virtual molecules derived through virtual synthesis, remains challenging. Herein, we present an SVM-/SVR-based method for screening virtually synthesizable molecules based on their reactants. The proposed method employs a combination of reactant-wise kernel functions for fast evaluation without sacrificing prediction accuracy. Tested on 120 small-molecule activity datasets against 10 macromolecule targets, the proposed SVR models with data augmentation performed on par with standard SVR models using the Tanimoto kernel. As a demonstration, all 6.4 × 10^12 reactant combinations were exhaustively evaluated by an SVR model within 8 days on a single desktop computer, enabling large-scale screening without sampling.
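To illustrate why reactant-wise kernels can make exhaustive evaluation tractable, the following minimal sketch assumes one particular combination that the abstract does not specify: a sum of two Tanimoto kernels, one per reactant. Under that assumption, the SVR decision function splits into one score per reactant, so each reactant library is scored once and every combination is obtained by a single addition. All names (`tanimoto_kernel`, `train_a`, `lib_a`, and so on) are hypothetical, the fingerprints are random bits, and `alpha` and `bias` are random stand-ins for the dual coefficients and intercept of a fitted SVR model.

```python
import numpy as np

def tanimoto_kernel(X, Y):
    """Tanimoto similarity between the rows of two binary fingerprint matrices."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    inter = X @ Y.T                                   # shared on-bits
    union = X.sum(axis=1)[:, None] + Y.sum(axis=1)[None, :] - inter
    # Two all-zero fingerprints are treated as identical (similarity 1).
    return np.divide(inter, union, out=np.ones_like(inter), where=union > 0)

rng = np.random.default_rng(0)
n_train, n_a, n_b, n_bits = 50, 20, 30, 64

# Hypothetical training products, each described by its two reactants.
train_a = rng.integers(0, 2, size=(n_train, n_bits))
train_b = rng.integers(0, 2, size=(n_train, n_bits))
alpha = rng.normal(size=n_train)   # stand-in for fitted SVR dual coefficients
bias = 0.1                         # stand-in for the fitted intercept

# Hypothetical reactant libraries to be combined exhaustively.
lib_a = rng.integers(0, 2, size=(n_a, n_bits))
lib_b = rng.integers(0, 2, size=(n_b, n_bits))

# Fast path: score each reactant once, then add scores for every pair.
score_a = tanimoto_kernel(lib_a, train_a) @ alpha     # shape (n_a,)
score_b = tanimoto_kernel(lib_b, train_b) @ alpha     # shape (n_b,)
fast_scores = score_a[:, None] + score_b[None, :] + bias

# Reference path: evaluate the summed kernel for each pair separately.
def naive_score(a, b):
    k = tanimoto_kernel(a[None, :], train_a)[0] + tanimoto_kernel(b[None, :], train_b)[0]
    return float(k @ alpha + bias)

naive_scores = np.array([[naive_score(a, b) for b in lib_b] for a in lib_a])
print(np.allclose(fast_scores, naive_scores))
```

In this sketch the kernel is evaluated once per library reactant rather than once per combination, which is the kind of saving that matters at scale: 6.4 × 10^12 combinations within 8 days corresponds to a sustained rate of roughly 9 million combinations per second. Other ways of combining reactant-wise kernels would not necessarily decompose this simply.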