Protein Representation in Metric Spaces for Protein Druggability Prediction: A Case Study on Aspirin

Abstract

Background: Accurately predicting protein druggability is crucial for successful drug development, as it reduces the time and resources required to identify viable drug targets. However, existing methods often face trade-offs between accuracy, efficiency, and interpretability. This study introduces a lightweight framework designed to address these trade-offs.

Methods: The framework embeds proteins into four biologically informed, non-Euclidean metric spaces derived from amino acid sequences, predicted secondary structures, and curated post-translational modification (PTM) annotations. These representations capture hydrophobicity profiles, PTM densities, spatial patterns, and secondary structure composition, providing interpretable proxies for structure-related determinants of druggability.

Results: Evaluated on an Aspirin-binding protein dataset using leave-one-out cross-validation (LOOCV), our distance-based ensemble achieves 92.25% accuracy (AUC = 0.9358) in the whole-protein setting. This outperforms common sequence-only baselines reported in the literature while remaining computationally efficient.

Conclusions: On a refined single-chain subset, our framework performs comparably to established feature engineering pipelines. Together, these results suggest that biologically grounded, non-Euclidean embeddings provide an effective and interpretable alternative to resource-intensive 3D pipelines for target assessment in drug discovery, and that they can streamline target identification and validation.
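To make the evaluation protocol concrete, the following is a minimal sketch of a distance-based ensemble scored with leave-one-out cross-validation on precomputed distance matrices, one per metric space. The abstract does not specify the classifier or the ensemble rule, so the k-nearest-neighbour vote, the score averaging, the function names, and the toy data (which uses plain absolute differences as a stand-in for the paper's non-Euclidean metrics) are all assumptions made for illustration only.

```python
import numpy as np

def loocv_knn_scores(dist, labels, k=1):
    """Leave-one-out k-NN on a precomputed distance matrix.

    For each sample, returns the fraction of its k nearest *other*
    samples that carry the positive label (1).
    """
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    scores = np.empty(len(labels))
    for i in range(len(labels)):
        d = dist[i].copy()
        d[i] = np.inf                       # hold out sample i itself
        nearest = np.argsort(d, kind="stable")[:k]
        scores[i] = labels[nearest].mean()
    return scores

def ensemble_loocv(dist_matrices, labels, k=1, threshold=0.5):
    """Average the per-space LOOCV scores, then threshold them.

    Averaging is an assumed ensemble rule, not the paper's stated one.
    """
    labels = np.asarray(labels)
    per_space = [loocv_knn_scores(d, labels, k) for d in dist_matrices]
    scores = np.mean(per_space, axis=0)
    preds = (scores > threshold).astype(int)
    return float((preds == labels).mean()), scores

# Toy data: six hypothetical proteins described in two hypothetical spaces.
labels = np.array([0, 0, 0, 1, 1, 1])
space_a = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
space_b = np.array([1.0, 1.2, 0.9, 9.0, 9.5, 8.7])

# Pairwise absolute differences stand in for the real distance functions.
dist_a = np.abs(space_a[:, None] - space_a[None, :])
dist_b = np.abs(space_b[:, None] - space_b[None, :])

acc, scores = ensemble_loocv([dist_a, dist_b], labels, k=1)
print(acc)  # 1.0
```

Working from precomputed distance matrices keeps the sketch independent of how each metric is defined, so any of the four spaces described above could be substituted without changing the evaluation loop.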
