Intergrader Agreement on Qualitative and Quantitative Assessment of Diabetic Retinopathy Severity Using Ultra-Widefield Imaging: INSPIRED Study Report 1

Authors: Eleonora Riotto, Wei-Shan Tsai, Hagar Khalid, Francesca Lamanna, Louise Roch, Medha Manoj, Sobha Sivaprasad

Journal:	Diagnostics	Impact factor:	3.300
Year:	2025	Volume/issue:	2025 Jul 21;15(14)
DOI:	10.3390/diagnostics15141831	Target:	RED
Research area:	metabolism, neuroscience	Disease type:	retinopathy, diabetes, diabetic retinopathy

Abstract

Background/Objectives: Discrepancies in diabetic retinopathy (DR) grading are well documented, and quantifying retinal non-perfusion (RNP) poses even greater challenges. This study assessed intergrader agreement in DR evaluation, focusing on qualitative severity grading and quantitative RNP measurement, and aimed to improve agreement through structured consensus meetings.

Methods: A retrospective analysis of 100 comparisons from 50 eyes (36 patients) was conducted. Two paired medical retina fellows graded ultra-widefield color fundus photographs (CFP) and fundus fluorescein angiography (FFA) images. CFP assessments included DR severity using the International Clinical Diabetic Retinopathy (ICDR) grading system, the DR Severity Scale (DRSS), and predominantly peripheral lesions (PPL). FFA-based RNP was defined as capillary loss with grayscale matching the foveal avascular zone. Weekly adjudication by a senior specialist resolved discrepancies. Intergrader agreement was evaluated using Cohen's kappa (qualitative DRSS) and intraclass correlation coefficients (ICC) (quantitative RNP). Bland-Altman analysis assessed bias and variability.

Results: After eight consensus meetings, CFP grading agreement improved to excellent: kappa = 91% (ICDR DR severity), 89% (DRSS), and 89% (PPL). FFA-based PPL agreement reached 100%. For RNP, the non-perfusion index (NPI) showed moderate overall ICC (0.49), with regional ICCs ranging from 0.40 to 0.57 (highest in the nasal region, ICC = 0.57). Bland-Altman analysis revealed a mean NPI difference of 0.12 (limits of agreement: -0.11 to 0.35), indicating acceptable variability despite outliers.

Conclusions: Structured consensus training achieved excellent intergrader agreement for DR severity and PPL grading, supporting the clinical reliability of ultra-widefield imaging. However, RNP measurement variability underscores the need for standardized protocols and automated tools to enhance reproducibility. This process is critical for developing robust AI-based screening systems.
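For readers less familiar with the agreement statistics named in the abstract, the following is a minimal Python sketch of unweighted Cohen's kappa and Bland-Altman limits of agreement. It is not the study's analysis code: the function names and the example grades and NPI values are hypothetical, and the sketch assumes the conventional limits of mean difference ± 1.96 standard deviations, which the abstract does not state explicitly. Under that assumption, the reported limits (-0.11 to 0.35) around a mean difference of 0.12 would correspond to a standard deviation of the differences of roughly 0.23 / 1.96 ≈ 0.12.

```python
from collections import Counter
from statistics import mean, stdev


def cohen_kappa(grades_a, grades_b):
    """Unweighted Cohen's kappa for two graders' categorical labels."""
    n = len(grades_a)
    # Observed agreement: fraction of items given the same grade by both graders.
    p_observed = sum(a == b for a, b in zip(grades_a, grades_b)) / n
    # Chance agreement: product of each grader's marginal frequencies, summed over grades.
    count_a, count_b = Counter(grades_a), Counter(grades_b)
    categories = count_a.keys() | count_b.keys()
    p_chance = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)


def bland_altman(values_a, values_b):
    """Return (bias, lower limit, upper limit) using bias +/- 1.96 SD of the differences."""
    diffs = [a - b for a, b in zip(values_a, values_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical example data, for illustration only (not from the study).
grader_1_dr = [0, 0, 1, 1]
grader_2_dr = [0, 0, 1, 0]
kappa = cohen_kappa(grader_1_dr, grader_2_dr)

grader_1_npi = [0.20, 0.40, 0.60]
grader_2_npi = [0.10, 0.20, 0.50]
bias, lower, upper = bland_altman(grader_1_npi, grader_2_npi)
```

In practice, a weighted kappa is often preferred for ordinal scales such as DRSS, and the ICC depends on which model (one-way or two-way, single or average measures) is chosen; the abstract does not specify either, so neither is sketched here.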
