Abstract
PURPOSE: To develop nonimaging machine learning models that use clinical data from the first screening to predict the occurrence of retinopathy of prematurity (ROP).
DESIGN: This multicenter regional study was conducted in Yamagata Prefecture, Japan.
PARTICIPANTS: We collected clinical data of neonates born between October 2016 and September 2018 and screened in 4 neonatal care units.
METHODS: The 35 variables available at the first screening were used as candidate predictors to develop a decision tree, a random forest, a gradient-boosted tree, a neural network, and a Naive Bayes model. Parameter tuning was performed using 10-fold cross-validation. This process was repeated 200 times using different random seeds for data partitioning.
MAIN OUTCOME MEASURES: The target outcome was the final ROP outcome (i.e., the development of any stage of ROP during hospitalization).
RESULTS: Of the 215 neonates screened, 43 (20.0%) developed ROP. The median gestational age was 31.4 (interquartile range: 28.1-33.4) weeks, and the median birth weight was 1502 (interquartile range: 967-1823) g. Averaged over the 200 iterations, the area under the receiver operating characteristic curve (AUC-ROC), accuracy, sensitivity, and specificity of the random forest model were 0.93 (95% confidence interval [CI] 0.83-0.99), 90.1% (95% CI 84.1-95.2), 95.7% (95% CI 88.2-100), and 66.0% (95% CI 41.7-91.7), respectively. The corresponding values for the Naive Bayes model were 0.94 (95% CI 0.86-0.99), 90.6% (95% CI 84.1-96.8), 94.6% (95% CI 86.3-100), and 73.6% (95% CI 50.0-91.7), respectively.
CONCLUSIONS: Nonimaging machine learning methods showed high performance in predicting the occurrence of ROP. These models can be beneficial when a fundus camera cannot capture images because of eye opacity and for hospitals that lack pediatric fundus cameras.
FINANCIAL DISCLOSURES: The author(s) have no proprietary or commercial interest in any materials discussed in this article.
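The repeated-partition evaluation described under METHODS can be illustrated with a minimal Python sketch. It is not the authors' code. The data are synthetic, the two features and their distributions are made up, and all function names are hypothetical. The 70/30 hold-out fraction and the percentile-based interval are assumptions, since the abstract states neither. Only a Gaussian Naive Bayes scorer is shown, and the 10-fold cross-validated tuning step is omitted because this scorer has no hyperparameters to tune.

```python
import math
import random
import statistics


def auc_roc(labels, scores):
    """AUC-ROC via the Mann-Whitney identity; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx), preserving the class ratio in both parts."""
    train, test = [], []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = max(1, round(len(idx) * test_fraction))
        test += idx[:cut]
        train += idx[cut:]
    return train, test


def fit_gaussian_nb(X, y):
    """Per-class log prior plus per-feature mean and variance."""
    params = {}
    for cls in (0, 1):
        rows = [x for x, label in zip(X, y) if label == cls]
        means = [statistics.mean(col) for col in zip(*rows)]
        variances = [statistics.pvariance(col) + 1e-9 for col in zip(*rows)]
        params[cls] = (math.log(len(rows) / len(X)), means, variances)
    return params


def log_odds(params, x):
    """Log posterior odds of class 1 versus class 0 for one sample."""
    def joint(cls):
        prior, means, variances = params[cls]
        return prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances)
        )
    return joint(1) - joint(0)


def summarize(values):
    """Mean and empirical 2.5th/97.5th percentiles (lower nearest rank)."""
    ordered = sorted(values)
    lo = ordered[int(0.025 * (len(ordered) - 1))]
    hi = ordered[int(0.975 * (len(ordered) - 1))]
    return statistics.mean(ordered), lo, hi


def run(n_iterations=200, n=215, n_pos=43, test_fraction=0.3):
    """Repeat split/fit/score with a different partition seed each iteration."""
    data_rng = random.Random(0)
    y = [1] * n_pos + [0] * (n - n_pos)
    # Synthetic, made-up features; NOT the study's clinical variables.
    X = [[data_rng.gauss(27 if label else 32, 2.5),
          data_rng.gauss(900 if label else 1600, 350)] for label in y]
    aucs = []
    for seed in range(n_iterations):
        rng = random.Random(seed)  # the seed controls only the partition
        train, test = stratified_split(y, test_fraction, rng)
        model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
        scores = [log_odds(model, X[i]) for i in test]
        aucs.append(auc_roc([y[i] for i in test], scores))
    return summarize(aucs)


mean_auc, lower, upper = run()
print(f"mean AUC-ROC {mean_auc:.2f} (2.5th-97.5th percentile {lower:.2f}-{upper:.2f})")
```

Because only the partition seed changes between iterations, the spread of the 200 scores reflects how sensitive the estimate is to the particular train/test split, which matters for a cohort of this size.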