Machine Learning Techniques for Computational Characterization of Protein-Associated Small Molecules with Therapeutic Relevance

Abstract

This study presents a computational QSPR framework for analyzing protein-associated small molecules with machine learning under limited-sample conditions. Protein-related molecules were modeled as molecular graphs, and their structural characteristics were quantified using topological indices, physicochemical properties, and molecular complexity descriptors. Because the number of descriptors is high relative to the sample size (n = 20), univariate feature selection was used to retain the most informative descriptors, and model performance was evaluated using Leave-One-Out Cross-Validation (LOOCV). An Artificial Neural Network (ANN) optimized with the L-BFGS algorithm was benchmarked against Support Vector Regression (SVR) to assess predictive robustness. The results show excellent predictive accuracy for mass-related properties such as molecular weight, while revealing inherent limitations of topological descriptors for complex physicochemical properties, including isoelectric point and hydrophobicity. Reporting cross-validation uncertainty supports the stability of the models and mitigates overfitting concerns, establishing the proposed framework as a reliable and interpretable approach for small-sample QSPR modeling with therapeutic relevance.
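The evaluation protocol described in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' code. The descriptor matrix is synthetic random data, and the number of retained descriptors (`k=5`), the hidden layer size, the SVR hyperparameters, and the use of the spread of per-sample absolute errors as the "cross-validation uncertainty" are all assumptions made for illustration. The one point the sketch is meant to show is that feature selection sits inside the pipeline, so it is refit on the 19 training molecules of each LOOCV fold and never sees the held-out molecule.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.metrics import r2_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the descriptor table (assumption: 20 molecules,
# 30 descriptors; the real descriptors are not given in the abstract).
rng = np.random.default_rng(0)
n_samples, n_descriptors = 20, 30
X = rng.normal(size=(n_samples, n_descriptors))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=n_samples)


def loocv_report(model, X, y, k=5):
    """Run LOOCV with scaling and univariate selection refit in every fold."""
    pipe = make_pipeline(
        StandardScaler(),
        SelectKBest(score_func=f_regression, k=k),  # univariate feature selection
        model,
    )
    # One prediction per molecule, each made by a model that never saw it.
    preds = cross_val_predict(pipe, X, y, cv=LeaveOneOut())
    abs_err = np.abs(y - preds)
    return {
        "preds": preds,
        "mae": abs_err.mean(),
        "mae_std": abs_err.std(ddof=1),  # spread across the held-out samples
        "r2": r2_score(y, preds),        # pooled over all held-out predictions
    }


models = {
    "ANN (L-BFGS)": MLPRegressor(
        hidden_layer_sizes=(8,), solver="lbfgs", alpha=1e-2,
        max_iter=2000, random_state=0,
    ),
    "SVR": SVR(kernel="rbf", C=10.0, epsilon=0.1),
}

results = {name: loocv_report(m, X, y) for name, m in models.items()}
for name, r in results.items():
    print(f"{name}: MAE = {r['mae']:.3f} +/- {r['mae_std']:.3f}, R2 = {r['r2']:.3f}")
```

With a single held-out sample per fold, a per-fold R² is undefined, so the sketch pools the 20 held-out predictions before computing R² and reports the mean and standard deviation of the absolute errors alongside it.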
