Abstract
OBJECTIVE: To build a time-series machine learning (ML) model that improves bronchopulmonary dysplasia (BPD) prediction compared with published online calculators.

STUDY DESIGN: We used a single-center cohort of extremely low gestational age newborns (inborn, birth years 2016-2021, n = 438). The primary outcome was a 5-level categorical BPD outcome as defined by the Neonatal Research Network (NRN) in 2019. Flowsheet data were extracted from the electronic medical record. Time-series data were generated from birth onward, with 14 static and 35 dynamic input attributes. Static (regression) and dynamic (ML) models were built iteratively; their performance was compared with that of the NRN BPD calculator at postnatal days 1, 3, 7, 14, and 28, and feature leverage was ranked at each time point.

RESULTS: Of the original cohort, 92 infants met all inclusion criteria (gestational age 25.6 ± 1.4 weeks). Static models performed comparably with the NRN BPD calculator (area under the curve = 0.7460), improving to 0.7978 with forward/backward selection. In contrast, dynamic long short-term memory (LSTM) models outperformed static models at all time points, reaching a peak area under the curve of 0.8400 on postnatal day 28. LSTM models performed best for the no-BPD and severe disease/death classes. Principal component analysis revealed that respiratory support, ventilator settings, supplemental oxygen requirements, medications, and prenatal/postnatal growth were major factors driving BPD severity.

CONCLUSIONS: LSTM-based ML time-series analysis substantially outperformed static approaches for predicting BPD and death among extremely low gestational age newborns. Integrating ML methods into clinical applications holds promise for enhancing real-time BPD trajectory mapping.
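
The input layout described above (14 static attributes, 35 dynamic attributes, evaluation at postnatal days 1, 3, 7, 14, and 28) can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes one row of dynamic values per postnatal day and appends the static values to every row, which is one common way to feed both kinds of input to a recurrent model; the abstract does not state the sampling interval or how the two were combined. The names `InfantRecord`, `window_up_to`, and `build_windows` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

N_STATIC = 14                   # static attributes per infant (from the abstract)
N_DYNAMIC = 35                  # dynamic attributes per time step (from the abstract)
EVAL_DAYS = (1, 3, 7, 14, 28)   # postnatal days at which models were compared


@dataclass
class InfantRecord:
    """Hypothetical container; field names are illustrative only."""
    static: List[float]        # length N_STATIC
    daily: List[List[float]]   # assumed: one row of N_DYNAMIC values per postnatal day


def window_up_to(record: InfantRecord, day: int) -> List[List[float]]:
    """Return rows for postnatal days 1..day, each with the static values appended."""
    if len(record.static) != N_STATIC:
        raise ValueError("unexpected number of static attributes")
    if day < 1 or day > len(record.daily):
        raise ValueError("requested day is outside the recorded range")
    rows = []
    for row in record.daily[:day]:
        if len(row) != N_DYNAMIC:
            raise ValueError("unexpected number of dynamic attributes")
        # Assumption: static values are repeated at every time step.
        rows.append(list(row) + list(record.static))
    return rows


def build_windows(record: InfantRecord) -> Dict[int, List[List[float]]]:
    """Build one input sequence per evaluation day that the record covers."""
    return {d: window_up_to(record, d) for d in EVAL_DAYS if d <= len(record.daily)}
```

Each resulting sequence has as many steps as the evaluation day and 49 values per step, so a model evaluated at day 7 sees only information available by day 7.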