Abstract
Photocatalysis is an environmentally conscious tool for removing contaminants from water. Novel photocatalytic materials are often evaluated on their ability to degrade only a small number of analytes, which may not be indicative of broader applicability. In this work, an experimental method dubbed high-throughput photocatalysis (HTP) is introduced to assay photocatalytic materials against a range of analytes in a time-efficient manner. HTP is modular: experimental parameters, including the matrix, can be changed to fit a proposed application. The photodegradation of each analyte is obtained in a consistent manner so that machine learning (ML) models can be applied to the resulting datasets. Three out-of-the-box ML models, namely linear regression, random forest (RF), and neural network (NN), are tasked with estimating the percentage removal as a function of irradiation time and molecular structure, as represented by Morgan fingerprints. Leave-out sets demonstrated that the RF and NN models did not overfit the training data and reasonably estimated the degradation of unknown molecules. SHapley Additive exPlanations (SHAP) values are used to correlate molecular substructures with the parent molecule's susceptibility to photocatalytic degradation. These correlations are used to generate heatmaps of estimated reactivity within molecules, which corroborate reports in which dye degradation pathways were studied in detail.
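The data layout implied by this description can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes one row per (molecule, irradiation time) pair, a feature vector formed by prepending the irradiation time to the fingerprint bits, and a leave-out split made by molecule so that held-out molecules are entirely unseen during training. All names (`FINGERPRINTS`, `make_records`, `leave_molecules_out`, `to_features`), the 4-bit vectors, and the removal values are hypothetical placeholders; real Morgan fingerprints would be computed with a cheminformatics toolkit such as RDKit and are typically far longer.

```python
import random

# Hypothetical toy inputs: molecule names, bit vectors, and removal values are
# placeholders for illustration only, not data from the study.
FINGERPRINTS = {
    "dye_A": (1, 0, 1, 0),
    "dye_B": (0, 1, 1, 0),
    "dye_C": (1, 1, 0, 0),
    "dye_D": (0, 0, 1, 1),
}
TIMES_MIN = (0, 30, 60)  # assumed irradiation times in minutes


def make_records():
    """Build one row per (molecule, irradiation time) pair."""
    records = []
    for i, name in enumerate(sorted(FINGERPRINTS)):
        for t in TIMES_MIN:
            # Arbitrary toy trend, capped at 100 % removal.
            removal = min(100.0, t * (0.5 + 0.25 * i))
            records.append({"molecule": name, "time_min": t, "removal_pct": removal})
    return records


def leave_molecules_out(records, n_holdout, seed=0):
    """Split by molecule so no held-out molecule contributes any training row."""
    molecules = sorted({r["molecule"] for r in records})
    held_out = set(random.Random(seed).sample(molecules, n_holdout))
    train = [r for r in records if r["molecule"] not in held_out]
    test = [r for r in records if r["molecule"] in held_out]
    return train, test, held_out


def to_features(record):
    """Feature vector: irradiation time followed by the fingerprint bits."""
    bits = FINGERPRINTS[record["molecule"]]
    return [float(record["time_min"])] + [float(b) for b in bits]


records = make_records()
train, test, held_out = leave_molecules_out(records, n_holdout=1)
X_train = [to_features(r) for r in train]
y_train = [r["removal_pct"] for r in train]
X_test = [to_features(r) for r in test]
y_test = [r["removal_pct"] for r in test]
```

Splitting by molecule rather than by row matters here: a row-wise random split would place different time points of the same molecule in both sets, so the test score would no longer reflect performance on unknown molecules. The resulting `X_train` and `y_train` lists could then be passed to any regressor, for example a random forest.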