Abstract
Electrophilic compounds are bioactive components commonly found in foods that can covalently modify nucleophilic sites on biologically functional macromolecules. These compounds may elicit either beneficial bioactivity or toxicity, and characterizing their biological activity de novo demands considerable time and resources. In this study, we developed a database of 332 food-derived electrophilic compounds and used a semi-supervised k-nearest neighbors (KNN) machine learning model to predict their bioactivity. Molecular docking analysis identified the three chalcone compounds with the highest potential positive activity: 4-hydroxyderricin (4HD), isoliquiritigenin (ISO), and butein. In cell experiments, treatment with 4HD, ISO, and butein significantly reduced reactive oxygen species (ROS) levels. RT-qPCR analysis demonstrated that these chalcones significantly upregulated the mRNA expression of Nrf2 and its downstream antioxidant genes, including Nqo1, HO-1, Gsr, Gclc, and Gclm. ISO's cytoprotective and antioxidant effects were abolished following [...]. These findings highlight that 4HD, ISO, and butein are effective Nrf2 activators and suggest that comprehensive virtual technology is a promising strategy for identifying functional bioactive compounds.