Abstract
Computational toxicology plays an important role in risk assessment and drug safety. The field has traditionally been dominated by Quantitative Structure-Activity Relationships (QSARs), which predict toxicological effects based solely on chemical structure. Although QSARs have achieved successes, their reliance on structure alone limits drug toxicity prediction, where small structural modifications can cause major changes in toxicity. Advances in artificial intelligence (AI), especially text embedding and generative AI, provide an opportunity to enhance toxicity prediction by leveraging broader chemical knowledge and integrating it with structural data. In this study, we propose a novel framework, Quantitative Knowledge-Activity Relationships (QKARs), which predicts toxicity from domain-specific knowledge. We developed QKAR models for two drug toxicity endpoints, drug-induced liver injury (DILI) and drug-induced cardiotoxicity (DICT), using three knowledge representations that capture different levels of knowledge. Representations based on comprehensive knowledge of the drugs yielded better predictions than those based on simpler knowledge. We applied five machine learning algorithms of differing complexity in the QKAR models and observed little association between model complexity and performance. We further evaluated QKARs against QSARs on the same endpoints using identical datasets and found that QKARs consistently outperformed QSARs for both DILI and DICT. Notably, QKARs were better than QSARs at differentiating drugs with similar structures but different liver toxicity profiles. We also investigated combining knowledge-based and structure-based representations, Q(K + S)ARs, to further improve prediction accuracy. Our findings demonstrate the potential of QKARs as a robust alternative to QSARs, offering additional opportunities in drug toxicity assessment by leveraging both domain-specific knowledge and structural data.
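
The idea of combining knowledge-based and structure-based representations can be illustrated with a minimal sketch. It assumes that each drug already has a numeric knowledge embedding and a binary structural fingerprint, and that a Q(K + S)AR input is formed by simple concatenation; the abstract does not specify the integration method, so this is an assumption. The vectors below are made-up placeholders, not data from this study, and the nearest-centroid classifier is a hypothetical stand-in for the five machine learning algorithms, which are not named here.

```python
from math import sqrt


def concat_features(knowledge_vec, structure_vec):
    """Build a Q(K + S)AR-style input (assumed here to be plain concatenation)."""
    return list(knowledge_vec) + list(structure_vec)


def centroid(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class NearestCentroid:
    """Toy classifier: assign the label whose class centroid is closest."""

    def fit(self, X, y):
        self.centroids_ = {
            label: centroid([x for x, t in zip(X, y) if t == label])
            for label in sorted(set(y))
        }
        return self

    def predict(self, X):
        return [
            min(self.centroids_, key=lambda c: euclidean(x, self.centroids_[c]))
            for x in X
        ]


# Placeholder data (NOT from the study): 2-d knowledge embeddings,
# 3-bit structural fingerprints, and binary toxicity labels.
knowledge = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
structure = [[1, 0, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1]]
labels = [1, 1, 0, 0]  # 1 = toxic, 0 = non-toxic (illustrative only)

X = [concat_features(k, s) for k, s in zip(knowledge, structure)]
model = NearestCentroid().fit(X, labels)

# A query drug sharing its fingerprint with both a toxic and a non-toxic
# training drug; only the knowledge part of the vector separates the classes.
query = concat_features([0.85, 0.15], [1, 0, 1])
prediction = model.predict([query])
```

In this toy setup the first and third training drugs share the same fingerprint but carry different labels, which mirrors the case of structurally similar drugs with different toxicity profiles: the knowledge dimensions are what distinguish them.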