Abstract
BACKGROUND: Keratoconus (KC) is a progressive corneal disorder that leads to visual impairment owing to corneal thinning and deformation. While current diagnostic methods are effective in the later stages, early detection of KC remains challenging. This study aimed to investigate the potential of tear fluid biomarkers combined with machine learning (ML) models for the early diagnosis of KC. METHODS: A total of 370 participants were recruited, including 134 patients with KC, 93 with early KC, and 143 normal controls. Cytokine levels in tear fluid were measured using multiplex cytokine analysis. Clinical parameters were evaluated, and nine ML models were employed to predict early KC risk based on tear biomarkers and clinical data. RESULTS: IL-1β, Galectin-1, and Galectin-3 levels were significantly higher in the KC and early KC groups than in controls (p < 0.05 for all). The Random Forest model demonstrated the highest accuracy when integrating tear biomarker levels and clinical parameters. Key biomarkers, such as Galectin-3, IL-1β, and Galectin-1, in combination with clinical parameters (D-index, BE, and Kmax), significantly enhanced prediction accuracy. Furthermore, a web application was developed to facilitate clinical deployment, enabling personalized risk assessment for early KC. CONCLUSIONS: This study highlights the potential of integrating noninvasive tear fluid biomarkers, Pentacam parameters, and ML techniques to improve the early detection of KC. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12886-025-04451-8.