Abstract
Photocatalysis has emerged as a transformative tool in modern synthesis, but AI-assisted reaction prediction in this area remains limited by scarce data. We present PhotoCatDB, a curated, open-source database of 26.7K photocatalytic reactions with detailed mechanistic annotations, including 9.2K multicomponent transformations. Leveraging this resource alongside 100 million molecular data points, we developed PhotoCat, a Transformer-based platform that achieves accuracies of 82.6% in photocatalytic reaction prediction, 77.1% in retrosynthesis, and 88.5% in condition recommendation. We validated the platform experimentally by discovering four novel photocatalytic reactions with yields of up to 75.3%. This integrated approach establishes a framework for data-driven innovation in photocatalysis, bridging computational prediction with experimental validation to accelerate discovery in sustainable chemistry.