Abstract
Efficient exploration of chemical space is an essential component of modern generative drug design. Herein, we introduce ChemBang, a computational engine that grows small molecules using chemical transformations extracted by matched molecular pair analysis of all structures available in catalogues of synthesized molecules. Each chemical transformation is mapped onto its associated atomic environment, defined as the substructure within a three-atom radius of the transformation site. Unsupervised chemical evolution is then performed in cycles by systematically applying chemical transformations to all exposed atomic environments present in a seed structure. Multiple physicochemical properties and substructural alerts are incorporated to guide the generation towards drug-like, synthetically accessible molecules. As a use case, the generation of the erdafitinib structure from any of its three ring systems (pyrazole, benzene and quinoxaline), and the evolution of the property distributions of the molecules generated in each cycle, are discussed in detail. The ability of ChemBang to explore pharmaceutically relevant chemical space is demonstrated by successfully generating the exact chemical structure of 95.3% of all 2,809 small-molecule ATC drugs from their constituent fragments.
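
The growth cycle described above can be illustrated with a minimal Python sketch. It is not the ChemBang implementation: the molecule representation (element symbols plus an adjacency list with implicit hydrogens), the function names `environment`, `grow` and `evolve`, and the `rules` and `max_valence` tables are all hypothetical and chosen only for illustration. The environment key used here is a crude fingerprint (element symbols sorted by bond distance from the site), and the sketch omits duplicate removal, property filters and substructural alerts.

```python
from collections import deque


def environment(atoms, bonds, site, radius=3):
    """Crude environment key: sorted (distance, element) pairs for all
    atoms within `radius` bonds of `site` (assumed stand-in for the
    three-atom-radius substructure)."""
    dist = {site: 0}
    queue = deque([site])
    while queue:
        a = queue.popleft()
        if dist[a] == radius:
            continue  # do not expand beyond the radius
        for b in bonds[a]:
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return tuple(sorted((d, atoms[i]) for i, d in dist.items()))


def grow(atoms, bonds, site, new_atom):
    """Attach `new_atom` to `site` and return a new (atoms, bonds) pair."""
    new_index = len(atoms)
    new_bonds = [set(n) for n in bonds] + [{site}]
    new_bonds[site].add(new_index)
    return atoms + (new_atom,), tuple(frozenset(n) for n in new_bonds)


def evolve(seed, rules, max_valence, cycles=1):
    """Apply every matching rule to every exposed site, once per cycle."""
    generation = [seed]
    for _ in range(cycles):
        next_generation = []
        for atoms, bonds in generation:
            for site, element in enumerate(atoms):
                if len(bonds[site]) >= max_valence[element]:
                    continue  # no free valence, so the site is not exposed
                key = environment(atoms, bonds, site)
                for new_atom in rules.get(key, ()):
                    next_generation.append(grow(atoms, bonds, site, new_atom))
        generation = next_generation
    return generation


# Hypothetical toy data: a two-carbon seed and one environment-keyed rule.
seed = (("C", "C"), (frozenset({1}), frozenset({0})))
rules = {((0, "C"), (1, "C")): ["N", "O"]}
max_valence = {"C": 4, "N": 3, "O": 2}

products = evolve(seed, rules, max_valence, cycles=1)
```

Because both carbons of the seed share the same environment key, each rule fires at both sites, so `products` contains symmetry-equivalent duplicates. A practical implementation would canonicalize structures (for example with a cheminformatics toolkit) before starting the next cycle.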