Interpretable machine learning rationalizes carbonic anhydrase inhibition via conformal and counterfactual prediction

Abstract

Human carbonic anhydrase (hCA) isoforms IX and XII are promising anticancer targets. Yet their selective inhibition remains elusive because they closely resemble the abundant hCA II, whose off-target inhibition causes harmful side effects. To address this issue, we introduce an interpretable machine learning framework that predicts inhibition across hCA II, IX, and XII. Our approach combines rigorous data curation, systematic benchmarking of classical and deep learning models, and the integration of conformal prediction for uncertainty quantification with counterfactual explanations for molecular interpretability. In extensive benchmarking, Support Vector Machines with extended-connectivity fingerprints consistently outperform more complex models, underscoring the importance of data quality and validation over algorithmic complexity. Conformal prediction provides rigorous activity estimates, while counterfactual analysis rationalizes the structural features governing isoform selectivity; together they enable interpretable guidance for inhibitor design. To further test the model, we apply it to the selective inhibitor SLC-0111 and obtain predictions consistent with experiment. The model also reproduces the experimental finding that modifications in the tail region strongly affect selectivity, emphasizing the tail group as a key structural determinant for differentiating inhibitor activity among hCA isoforms II, IX, and XII. To facilitate adoption, we release CAInsight, user-friendly software with a graphical interface for virtual screening and generative design of selective hCA inhibitors.
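
As an illustration of how conformal prediction attaches an uncertainty estimate to a point prediction, the following is a minimal sketch of split conformal regression. It is not the implementation used in this work: the abstract does not state which conformal variant was applied, so the absolute-residual score, the function names `conformal_quantile` and `predict_interval`, and the calibration values (imagined as pKi-like activities) are all assumptions made for illustration.

```python
import math


def conformal_quantile(cal_true, cal_pred, alpha=0.25):
    """Return the split-conformal half-width for miscoverage level alpha.

    cal_true / cal_pred are measured and predicted activities on a held-out
    calibration set that the underlying model was not trained on.
    """
    # Nonconformity score (assumed here): absolute residual.
    scores = sorted(abs(y - p) for y, p in zip(cal_true, cal_pred))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to guarantee coverage at this alpha.
        return math.inf
    return scores[rank - 1]


def predict_interval(point_pred, q):
    """Symmetric prediction interval around a new point prediction."""
    return (point_pred - q, point_pred + q)


# Hypothetical calibration data, not taken from the paper.
cal_true = [6.0, 7.5, 8.0, 5.5, 6.5, 7.0, 8.5]
cal_pred = [6.5, 7.25, 8.0, 6.5, 6.25, 7.75, 8.0]

q = conformal_quantile(cal_true, cal_pred, alpha=0.25)
interval = predict_interval(7.0, q)
print(q, interval)
```

Under the usual exchangeability assumption, an interval built this way covers the true value with probability at least 1 - alpha, regardless of which model (for example, an SVM on fingerprints) produced the point predictions.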
