Abstract
Background: Diabetes mellitus is a heterogeneous metabolic disorder, and this heterogeneity poses substantial challenges for patient management. Emerging research suggests that unsupervised cluster analysis is a promising approach for unraveling it. This systematic review evaluated the effectiveness of unsupervised cluster analysis in identifying diabetes phenotypes, elucidating the risks of diabetes-related complications, and distinguishing treatment responses. Methods: We searched MEDLINE Complete, PubMed, and Web of Science and reviewed 41 relevant studies. Additionally, we conducted a cross-sectional study using K-means cluster analysis of real-world clinical data from 558 patients with diabetes. Results: A key finding was that five clusters were consistently reproduced across diverse populations, encompassing various patient origins and ethnic backgrounds: severe autoimmune diabetes (SAID), severe insulin-deficient diabetes (SIDD), severe insulin-resistant diabetes (SIRD), mild obesity-related diabetes (MOD), and mild age-related diabetes (MARD). MOD and MARD were the most prevalent clusters, while SAID was the least prevalent. Subgroup analysis stratified by ethnic group indicated a higher prevalence of SIDD among individuals of Asian descent than among other ethnic groups. Across populations, the clusters shared similar phenotypic traits and complication risk profiles, with some variation in their distribution and key clinical variables. Notably, the SIRD subtype was associated with a wide spectrum of kidney-related clinical presentations. Alternative clustering techniques may reveal additional clinically relevant diabetes subtypes. Our cross-sectional study identified five subgroups, each with a distinct profile of glycemic control, lipid metabolism, blood pressure, and renal function. Conclusions: Overall, the results suggest that unsupervised cluster analysis holds promise for revealing clinically meaningful subgroups with distinct characteristics, complication risks, and treatment responses that may remain undetected using conventional approaches.
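As a rough illustration of the K-means step mentioned in the Methods, the sketch below standardizes a set of clinical variables and runs plain Lloyd's algorithm using only the Python standard library. It is a minimal sketch under stated assumptions, not the study's pipeline. The abstract does not specify the input variables, the preprocessing, or the software used, so the two columns (HbA1c and BMI), the z-score scaling, the value `k=3`, and the helper names `standardize`, `sq_dist`, and `kmeans` are all hypothetical choices made for the example. The data are randomly generated and are not patient data.

```python
import random
import statistics


def standardize(rows):
    """Scale each column to zero mean and unit variance (z-scores)."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    # Fall back to 1.0 for a constant column to avoid division by zero.
    sds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialize centroids from k distinct randomly chosen points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [statistics.fmean(c) for c in zip(*members)]
    return labels, centroids


# Synthetic, illustrative values only (assumed columns: HbA1c in %,
# BMI in kg/m^2). These are NOT data from the study.
data_rng = random.Random(42)
data = (
    [[data_rng.gauss(6.5, 0.3), data_rng.gauss(24, 1.5)] for _ in range(30)]
    + [[data_rng.gauss(7.0, 0.3), data_rng.gauss(34, 1.5)] for _ in range(30)]
    + [[data_rng.gauss(10.5, 0.5), data_rng.gauss(26, 1.5)] for _ in range(30)]
)

scaled = standardize(data)
labels, centroids = kmeans(scaled, k=3)
sizes = [labels.count(j) for j in range(3)]
print("cluster sizes:", sizes)
```

Standardizing first keeps variables measured on different scales from dominating the distance calculation. K-means is also sensitive to initialization, so an analysis on real data would typically repeat the run from several starting points and compare candidate values of `k` before interpreting the resulting subgroups.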