Abstract
This study presents a computational quantitative structure-property relationship (QSPR) framework for analyzing the molecular graphs of protein-associated small molecules using machine learning under limited-sample conditions. Protein-related molecules were modeled as molecular graphs, and their structural characteristics were quantified using topological indices, physicochemical properties, and molecular complexity descriptors. Because the number of descriptors is high relative to the sample size (n = 20), univariate feature selection was used to retain the most informative descriptors, and model performance was evaluated using Leave-One-Out Cross-Validation (LOOCV). An Artificial Neural Network (ANN) optimized with the L-BFGS algorithm was benchmarked against Support Vector Regression (SVR) to assess predictive robustness. The results show high predictive accuracy for mass-related properties such as molecular weight, while revealing inherent limitations of topological descriptors for complex physicochemical properties, including isoelectric point and hydrophobicity. Reporting cross-validation uncertainty supports the stability of the models and helps address overfitting concerns, positioning the proposed framework as a reliable and interpretable approach for small-sample QSPR modeling with therapeutic relevance.
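The evaluation protocol summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes synthetic data (20 samples, 5 hypothetical descriptors), uses absolute Pearson correlation as the univariate selection score, and substitutes a one-descriptor least-squares line for the ANN and SVR models so that the example needs only the Python standard library. The point it shows is that feature selection is repeated inside every LOOCV fold, so the held-out sample never influences which descriptor is kept.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation; returns 0.0 for a constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def loocv(X, y):
    """Leave-one-out CV with univariate selection redone in each fold."""
    n = len(y)
    preds = []
    for i in range(n):
        train = [j for j in range(n) if j != i]
        y_tr = [y[j] for j in train]
        # Score every descriptor on the training fold only (no leakage).
        scores = [abs(pearson_r([X[j][c] for j in train], y_tr))
                  for c in range(len(X[0]))]
        best = scores.index(max(scores))
        slope, intercept = fit_line([X[j][best] for j in train], y_tr)
        preds.append(slope * X[i][best] + intercept)
    abs_err = [abs(p - t) for p, t in zip(preds, y)]
    mae = sum(abs_err) / n
    # Spread of the per-sample errors, reported as the CV uncertainty.
    sd = math.sqrt(sum((e - mae) ** 2 for e in abs_err) / (n - 1))
    return preds, mae, sd


# Synthetic stand-in data (assumption): the target depends on descriptor 0.
random.seed(0)
X = [[random.uniform(0, 10) for _ in range(5)] for _ in range(20)]
y = [2.0 * row[0] + 1.0 + random.gauss(0, 0.1) for row in X]

preds, mae, sd = loocv(X, y)
print(f"LOOCV MAE = {mae:.3f} +/- {sd:.3f}")
```

The source does not name a software library. If scikit-learn were used, the same structure could be expressed by placing `SelectKBest(f_regression)` and either `MLPRegressor(solver="lbfgs")` or `SVR` in a `Pipeline` and evaluating it with `LeaveOneOut`, which likewise refits the selection step in every fold.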