Abstract
Virtual screening is a critical step in computer-aided drug discovery. As synthetic databases and reaction schemes expand, the number of synthesizable compounds grows, making efficient screening essential. The success of ultra-large-scale virtual screening hinges on two factors: the accuracy of binding affinity prediction and the speed of screening. We previously developed AK-Score2 for accurate protein-ligand binding prediction and V-Dock for estimating docking scores. Here, we present Docking of Millions (DoM), an ultra-large-scale virtual screening system that integrates both methods. V-Dock is trained iteratively to approximate AK-Score2-based affinities, which accelerates screening by eliminating the need to dock every compound and thus saves time and resources. In benchmarks that screened 5 million compounds against DDR1, c-kit, ASK1, NSD1, CREBBP, and PDE5, DoM required 319 h on average, 12% of the time needed to screen the full library, and achieved an average retrieval rate of 89% in the top 100 compounds. In inhibition assays, 1 of 27 molecules tested against ASK1 and 4 of 31 molecules tested against DDR1 suppressed target activity by more than 50% at 10 μM. Further validation of selected compounds yielded lowest IC50 values of 1.96 μM for ASK1 and 788 nM for DDR1. These findings demonstrate that DoM is a practical and efficient platform for real-world drug discovery.
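
The iterative idea summarized above, in which a fast surrogate is repeatedly refit on expensively scored compounds and then used to choose which compounds to score next, can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not the DoM implementation: `true_score` is a hypothetical stand-in for an expensive scorer such as AK-Score2, the one-nearest-neighbour `surrogate_predict` is a hypothetical stand-in for a learned model such as V-Dock, each compound is reduced to a single made-up descriptor, and `retrieval_rate` assumes the retrieval rate is the fraction of the true top-k compounds that end up among the top-k scored compounds.

```python
import random


def true_score(x):
    # Toy stand-in for an expensive scoring call (assumption); lower is better.
    return (x - 0.7) ** 2


def surrogate_predict(x, library, scored):
    # Toy surrogate (assumption): reuse the score of the nearest scored compound.
    nearest = min(scored, key=lambda i: abs(library[i] - x))
    return scored[nearest]


def iterative_screen(library, n_init, batch, rounds, seed=0):
    # Score a random initial batch, then repeatedly score the compounds
    # that the surrogate currently predicts to be best.
    rng = random.Random(seed)
    ids = list(range(len(library)))
    scored = {i: true_score(library[i]) for i in rng.sample(ids, n_init)}
    for _ in range(rounds):
        pool = [i for i in ids if i not in scored]
        pool.sort(key=lambda i: surrogate_predict(library[i], library, scored))
        for i in pool[:batch]:
            scored[i] = true_score(library[i])
    return scored


def retrieval_rate(library, scored, k):
    # Fraction of the true top-k compounds found among the top-k scored ones.
    all_ids = range(len(library))
    true_top = set(sorted(all_ids, key=lambda i: true_score(library[i]))[:k])
    found_top = set(sorted(scored, key=scored.get)[:k])
    return len(true_top & found_top) / k


# Toy library: one random descriptor per compound (hypothetical data).
lib_rng = random.Random(42)
library = [lib_rng.random() for _ in range(1000)]

scored = iterative_screen(library, n_init=50, batch=50, rounds=3)
print("fraction of library scored:", len(scored) / len(library))
print("top-20 retrieval rate:", retrieval_rate(library, scored, k=20))
```

In this sketch only the initial batch plus three selected batches are ever passed to the expensive scorer, which is the source of the time saving; the retrieval rate then measures how many of the truly best compounds were still recovered despite skipping most of the library.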