Abstract
BACKGROUND: Diabetic retinopathy (DR) remains a leading cause of blindness among working-age adults, yet scalable risk stratification tools tailored to primary care are lacking, particularly in underserved settings where specialized eye examinations are unavailable. We aimed to develop and externally validate a pragmatic, web-based nomogram for DR risk prediction using only routinely collected electronic health record (EHR) variables in community-dwelling individuals with type 2 diabetes mellitus (T2DM).
METHODS: This retrospective cohort study analyzed EHR data from two independent Chinese populations. The primary cohort comprised 1,215 patients with T2DM from 45 community health centers in Shenzhen, randomly split into training (n=851) and internal validation (n=364) sets. An external validation cohort of 329 patients was obtained from a center in Nanjing. Candidate predictors were screened within the training set by univariate analysis and least absolute shrinkage and selection operator (LASSO) regression. Selected variables were entered into a multivariable logistic regression model to construct a nomogram, which was deployed as an interactive web application. Model performance was assessed using the area under the receiver operating characteristic curve (AUC-ROC), calibration plots, decision curve analysis (DCA), and clinical impact curves (CIC).
RESULTS: Three predictors were retained in the final model: diabetes duration, HbA1c, and high body mass index (BMI ≥24 kg/m², Chinese standard). The model demonstrated robust discrimination, with an AUC of 0.77 (95% CI: 0.73-0.81) in the training set, 0.79 (95% CI: 0.73-0.85) in internal validation, and 0.81 (95% CI: 0.75-0.87) in external validation. Calibration was adequate, with non-significant Hosmer-Lemeshow tests (P > 0.05) and Brier scores below 0.15 in all cohorts. DCA showed positive net benefit over a wide range of threshold probabilities (10-95%), and CIC showed a 1:1 ratio between predicted and observed DR cases at risk thresholds above 40%.
CONCLUSION: This three-parameter online nomogram provides a simple, readily implementable tool for DR risk stratification in primary care. Its performance in an independent external cohort and its reliance on variables routinely available in EHRs position it as a potentially cost-effective way to narrow the screening gap and enable timely specialist referral for high-risk patients with T2DM.
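As an illustration of two of the performance measures named in the Methods, the following minimal Python sketch computes a Brier score and the net benefit used in decision curve analysis, using their standard textbook definitions. It is not the study's code: the function names and the toy outcome and risk vectors are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of referring every patient whose predicted risk >= threshold.

    Standard DCA definition: TP/n - FP/n * pt / (1 - pt), where pt is the
    threshold probability.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Net benefit of the 'refer everyone' reference strategy."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data (not from the study): observed DR status and predicted risk.
toy_outcomes = [1, 0, 1, 0, 0, 1, 0, 0]
toy_risks = [0.8, 0.2, 0.6, 0.4, 0.1, 0.7, 0.3, 0.5]

if __name__ == "__main__":
    print("Brier score:", brier_score(toy_outcomes, toy_risks))
    print("Net benefit (model, pt=0.5):", net_benefit(toy_outcomes, toy_risks, 0.5))
    print("Net benefit (refer all, pt=0.5):", net_benefit_treat_all(toy_outcomes, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and comparing it with the "refer everyone" and "refer no one" (net benefit of zero) strategies; the model is useful at a given threshold when its net benefit exceeds both.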