Abstract
Diabetic retinopathy (DR) is a leading cause of vision impairment among individuals with type 2 diabetes mellitus (T2DM). This study aimed to construct and evaluate predictive models for DR by integrating clinical data with optical coherence tomography (OCT) parameters, using both least absolute shrinkage and selection operator (LASSO) regression and random forest (RF) algorithms. A retrospective analysis was conducted on the medical records of T2DM patients admitted between September 2020 and December 2023. After applying inclusion and exclusion criteria, 10,054 cases were selected. Patients were randomly assigned to training (70%) and validation (30%) cohorts. LASSO regression was used for variable selection, followed by logistic regression modeling. An RF model was also developed using the same features. Model performance was assessed using receiver operating characteristic (ROC) curves, and differences in the area under the curve (AUC) were compared with the DeLong test. Key predictors included gender, insulin therapy, duration of diabetes, urinary albumin-to-creatinine ratio, and retinal vessel density. The RF model performed better, with an AUC of 0.89 compared with 0.79 for the LASSO model (P < .05). Retinal vessel density was consistently a protective factor, whereas longer diabetes duration and elevated urinary albumin-to-creatinine ratio were associated with increased DR risk. OCT-derived retinal metrics, particularly vessel density, enhance the predictive capability of DR risk models. Of the 2 approaches, the RF model exhibited better classification performance and may serve as a practical tool for early screening and individualized risk assessment in clinical settings.
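As an illustration of two evaluation steps summarized above, the random 70/30 cohort split and the area under the ROC curve, the following minimal sketch may be helpful. It is not the authors' code: the function names `split_cohorts` and `roc_auc`, the seed, and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation rather than any particular statistics package.

```python
import random


def split_cohorts(records, train_fraction=0.7, seed=0):
    """Randomly assign records to training and validation cohorts.

    Hypothetical helper; the seed is arbitrary and only makes the
    shuffle reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = DR present, 0 = DR absent.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75

# Splitting ten placeholder patient IDs 70/30.
train_ids, valid_ids = split_cohorts(range(10))
print(len(train_ids), len(valid_ids))  # 7 3
```

The pairwise form is quadratic in the number of cases, so for a cohort of the size reported here a sorted, rank-based implementation from an established library would be the more practical choice. Comparing two correlated AUCs, as the DeLong test does, needs additional variance estimation that this sketch does not cover.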