Interpretable machine learning-driven QSAR modeling for coagulation factor X inhibitors: from molecular descriptors to predictive potency

Abstract

Inhibition of activated coagulation factor X (FXa) is a clinically validated therapeutic strategy; however, developing safer and more selective inhibitors remains a major challenge. In this study, we developed an interpretable machine learning-based QSAR framework to predict both the inhibitory potency and the activity class of small molecules targeting FXa. A structurally curated dataset of 6400 compounds was retrieved from ChEMBL, standardized, and encoded using 391 non-redundant Mordred descriptors retained after systematic filtering. Benchmarking of 42 regression and 42 classification algorithms identified ExtraTreesRegressor and XGBoostClassifier as the most robust models. The regression model achieved an R² of 0.760 and an RMSE of 0.831 on the independent test set, while the classification model reached an accuracy of 0.91 with balanced precision and recall and an ROC-AUC of 0.962. SHAP (SHapley Additive exPlanations) analysis further enhanced interpretability by revealing that electrostatic, topological, and polar surface descriptors were the dominant contributors to FXa inhibitory potency. Applicability domain assessment using Williams plots confirmed that most compounds in both the training and test sets lay within the model's reliable prediction space. Overall, the proposed QSAR pipeline combines strong predictive performance with mechanistic interpretability and rigorous validation, offering a practical computational tool for the virtual screening and rational design of novel FXa inhibitors.
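
Two steps named in the abstract, descriptor filtering and the Williams-plot applicability domain, can be illustrated with a minimal sketch. It is not the authors' code. The function names, the variance tolerance, the 0.95 correlation cutoff, the warning leverage h* = 3(p + 1)/n, and the ±3 standardized-residual band are assumptions based on common QSAR practice, and the abstract does not state the values actually used. Synthetic data stands in for the Mordred descriptor matrix, and an ordinary least-squares fit stands in for the ExtraTreesRegressor purely to produce residuals.

```python
import numpy as np


def filter_descriptors(X, var_tol=1e-8, corr_cutoff=0.95):
    """Return indices of descriptor columns kept after filtering.

    var_tol and corr_cutoff are illustrative, not values from the study.
    """
    X = np.asarray(X, dtype=float)
    # Step 1: drop near-constant descriptors.
    keep = [j for j in range(X.shape[1]) if X[:, j].var() > var_tol]
    if not keep:
        return []
    # Step 2: greedily drop descriptors that are highly correlated
    # with a descriptor that has already been selected.
    corr = np.atleast_2d(np.abs(np.corrcoef(X[:, keep], rowvar=False)))
    selected = []
    for a in range(len(keep)):
        if all(corr[a, b] <= corr_cutoff for b in selected):
            selected.append(a)
    return [keep[a] for a in selected]


def williams_statistics(X_train, resid_train, X_query, resid_query):
    """Leverages and standardized residuals, the two axes of a Williams plot."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    n, p = X_train.shape
    # Pseudo-inverse tolerates any remaining collinearity.
    xtx_inv = np.linalg.pinv(X_train.T @ X_train)
    # h_i = x_i^T (X^T X)^-1 x_i for every query compound.
    leverage = np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)
    h_star = 3.0 * (p + 1) / n  # assumed warning-leverage convention
    sigma = np.std(resid_train, ddof=1)
    std_resid = np.asarray(resid_query, dtype=float) / sigma
    inside = (leverage <= h_star) & (np.abs(std_resid) <= 3.0)
    return leverage, std_resid, h_star, inside


# Synthetic stand-in for a descriptor matrix: 5 informative columns,
# one near-duplicate of column 0, and one constant column.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 5))
duplicate = 2.0 * base[:, 0] + 1e-3 * rng.normal(size=200)
X = np.column_stack([base, duplicate, np.ones(200)])

kept = filter_descriptors(X)
X_kept = X[:, kept]

# Placeholder model: least squares on a synthetic potency-like response.
y = X_kept @ np.array([1.0, -0.5, 0.3, 0.0, 0.8]) + 0.5 * rng.normal(size=200)
coef, *_ = np.linalg.lstsq(X_kept, y, rcond=None)
resid = y - X_kept @ coef

leverage, std_resid, h_star, inside = williams_statistics(
    X_kept, resid, X_kept, resid
)
print("kept columns:", kept)
print("warning leverage h*:", h_star)
print("fraction inside domain:", inside.mean())
```

Plotting `std_resid` against `leverage`, with a vertical line at `h_star` and horizontal lines at ±3, gives the Williams plot. Compounds outside that box are the ones whose predictions would be treated with caution.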
