Abstract
Identifying potent lead molecules for specific targets remains a major bottleneck in drug discovery. As structural information about proteins becomes increasingly available, ultra-large virtual screenings (ULVSs), which computationally evaluate billions of molecules, offer a powerful way to accelerate early-stage drug discovery. Here, we introduce AdaptiveFlow, an open-source platform designed to make ULVSs more accessible, scalable, and efficient. AdaptiveFlow provides free access to a screening-ready version of the Enamine REAL Space (1), the largest library of ready-to-dock, drug-like molecules, containing 69 billion compounds that we prepared using the platform's ligand preparation module. A key innovation of the platform is its multi-dimensional grid of molecular properties, which helps researchers explore and prioritize chemical space more effectively and reduces computational costs by a factor of approximately 1000. This grid forms the basis of a new method for identifying promising regions of chemical space, enabling systematic exploration and prioritization of compound libraries. An optional active learning component can further accelerate this process by adaptively steering the search toward molecules most likely to bind a given target. To support a broad range of applications, AdaptiveFlow is compatible with over 1,500 docking protocols. The platform achieves near-linear scaling on up to 5.6 million CPUs in the AWS Cloud, setting a new benchmark for large-scale cloud computing in drug discovery. Using this approach, we identified nanomolar inhibitors of two disease-relevant targets: ferroptosis suppressor protein 1 (FSP1) and poly(ADP-ribose) polymerase 1 (PARP-1) (2, 3).
Leveraging newly solved crystal structures of FSP1 in complex with NAD+, FAD, and coenzyme Q1, we validated these hits experimentally and determined co-crystal structures of FSP1 bound to small-molecule inhibitors, revealing previously unknown inhibitor binding mechanisms. With its high scalability, flexibility, and open accessibility, AdaptiveFlow offers a powerful new resource for discovering and optimizing drug candidates at an unprecedented scale and speed.
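The idea of a multi-dimensional grid of molecular properties can be illustrated with a minimal sketch. This is not AdaptiveFlow's implementation: the property names, the bin edges, and the helper functions `tile_of`, `build_grid`, and `prioritize_tiles` are all hypothetical and chosen only for illustration. The sketch assumes each molecule carries precomputed properties, assigns it to a grid tile, docks only a small sample per tile, and ranks tiles by mean score, so that full screening effort can be focused on the most promising regions.

```python
import random
from collections import defaultdict

# Hypothetical bin edges per property (assumption, not taken from the paper).
BIN_EDGES = {
    "mw": [200, 300, 400, 500],  # molecular weight in Da
    "logp": [0, 2, 4],           # lipophilicity
    "hbd": [1, 3],               # hydrogen-bond donor count
}

def bin_index(value, edges):
    """Return the bin index for value: the number of edges it meets or exceeds."""
    return sum(value >= edge for edge in edges)

def tile_of(mol):
    """Map a molecule (dict of properties) to its grid tile, one index per property."""
    return tuple(bin_index(mol[prop], edges) for prop, edges in BIN_EDGES.items())

def build_grid(molecules):
    """Group molecules by the tile they fall into."""
    grid = defaultdict(list)
    for mol in molecules:
        grid[tile_of(mol)].append(mol)
    return grid

def prioritize_tiles(grid, score_fn, samples_per_tile=2, top_k=2, seed=0):
    """Score a small random sample per tile and return the top_k tiles.

    Lower scores are treated as better, following the usual docking convention.
    """
    rng = random.Random(seed)
    tile_scores = {}
    for tile, members in grid.items():
        sample = rng.sample(members, min(samples_per_tile, len(members)))
        tile_scores[tile] = sum(score_fn(m) for m in sample) / len(sample)
    return sorted(tile_scores, key=tile_scores.get)[:top_k]
```

In this sketch, `score_fn` stands in for a docking run. Because only `samples_per_tile` molecules per tile are scored in the first pass, the cost depends on the number of tiles, not on the library size; the size of the saving in practice depends on the grid resolution and sampling depth, which are not specified here.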