NLP-ROPCare: predicting retinopathy of prematurity with admission notes using natural language processing

Authors: Zhang, Yulin; Zhao, Shuai; Ren, Jianbing; Li, Yuwen; Zhao, Xinyu; Sun, Jie; Nie, Chuan; Xie, Suzhen; Huang, Xuelin; Wen, Jinming; Luo, Xianqiong; Zhang, Guoming

Journal:	BMJ Open Ophthalmology	Impact factor:	2.200
Year:	2026	Citation:	2026 Jan 16;11(1)
doi:	10.1136/bmjophth-2025-002385	Research area:	Neuroscience
Disease type:	Retinopathy

Abstract

OBJECTIVES: Retinopathy of prematurity (ROP) is a leading cause of childhood blindness worldwide, and more efficient models are needed to help predict treatment-requiring ROP. Our study aimed to develop a new prediction model for ROP occurrence and severity, named NLP-ROPCare, using natural language processing (NLP).
METHODS AND ANALYSIS: This was a retrospective observational study. Infants with a gestational age ≤32 weeks or birth weight ≤2000 g were enrolled at Guangdong Women and Children Hospital from 2013 to 2022. The cohort comprised 3922 preterm infants, of whom 1106 had ROP. Four pretrained language models - BERT (Bidirectional Encoder Representations from Transformers), RoBERTa (Robustly Optimized BERT pretraining Approach), MC-BERT (language pre-training via a Meta Controller) and NEZHA (NEural contextualiZed representation for CHinese lAnguage understanding) - were used to develop NLP prediction models based on free-form text in the admission notes. For comparison, two machine learning methods (Random Forest and Support Vector Machine) were used to construct prediction models based on 20 structured characteristics previously extracted from the admission notes. Performance evaluation metrics included accuracy, precision, recall, F1 score and area under the curve (AUC).
RESULTS: The NLP prediction models for ROP occurrence outperformed those for severity. The NEZHA model demonstrated the highest accuracy in predicting ROP occurrence, achieving an F1 score of 89.35% and an AUC of 0.90. Its performance was also better than that of the two machine learning models, whose highest F1 score was 78% with an AUC of 0.87. In addition, for predicting ROP severity, the F1 score of RoBERTa (78.44%) was slightly higher than that of NEZHA (77.81%), and RoBERTa also achieved the highest AUC, 0.91.
CONCLUSION: NLP-ROPCare combines the language models NEZHA and RoBERTa to enable early prediction of ROP occurrence and severity from unstructured free-form text in the admission notes of preterm infants, highlighting its value in early prevention of ROP. Further external validation should be carried out to further adjust the model.
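
The abstract reports accuracy, precision, recall, F1 score and AUC but does not say how they were computed. As a minimal sketch, assuming a binary task (ROP vs. no ROP) and the standard definitions, the metrics could be calculated as below. The function names, the 0.5 decision threshold and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

The F1 score depends on the chosen threshold, whereas AUC is computed from the raw scores. The two can therefore rank models differently, which may be relevant when comparing the F1 and AUC figures reported for the occurrence and severity tasks.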
