Abstract
Clustering high-dimensional biomedical data without prior knowledge of the number of clusters remains a major challenge in medical image and signal analysis. We present SONSC (Separation-Optimized Number of Smart Clusters), an adaptive and interpretable clustering framework driven by the Improved Separation Index (ISI), a novel internal validity metric that jointly evaluates intra-cluster compactness and inter-cluster separability. SONSC iteratively maximizes ISI across candidate cluster configurations to automatically infer the optimal number of clusters, without supervision or parameter tuning. Extensive experiments on benchmark datasets (MNIST, CIFAR-10) and real-world clinical modalities (chest X-ray, ECG, RNA-seq) demonstrate that SONSC consistently outperforms classical methods such as K-Means, DBSCAN, and spectral clustering in ISI, Silhouette score, and normalized mutual information (NMI). Beyond numerical performance, SONSC identifies clinically coherent structures aligned with expert-labeled categories, supporting its integration into diagnostic and patient stratification pipelines. By unifying algorithmic robustness with medical interpretability, SONSC provides a scalable and trustworthy solution for unsupervised biomedical data analysis.
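To make the selection loop concrete, the following is a minimal sketch of choosing the number of clusters by maximizing an internal validity index over candidate configurations. It is not the SONSC implementation. Because the ISI formula is not given in this section, the Silhouette score is swapped in as a stand-in index, and a plain k-means with deterministic farthest-first seeding is assumed as the base clusterer. The names `farthest_first`, `kmeans`, `silhouette`, and `select_k` are hypothetical and introduced only for illustration.

```python
import math


def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then add the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=25):
    """Plain Lloyd's k-means (assumed base clusterer); returns one label per point."""
    centers = farthest_first(points, k)
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean Silhouette score; used here only as a stand-in for ISI."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treat as worst case
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


def select_k(points, candidates, score=silhouette):
    """Score every candidate k and return the maximizer together with all scores."""
    scores = {k: score(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores


# Toy data: three well-separated 3x3 grids of points (illustrative only).
offsets = (-0.5, 0.0, 0.5)
blobs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + dx, cy + dy) for cx, cy in blobs for dx in offsets for dy in offsets]

best_k, scores = select_k(points, range(2, 6))
print(best_k)  # prints 3 for these three well-separated blobs
```

The `score` argument of `select_k` is left pluggable so that a different validity index, such as ISI once it is defined, could replace the Silhouette stand-in without changing the loop.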