Abstract
Lower respiratory tract infections (LRTIs) are a leading cause of mortality and are challenging to diagnose in critically ill patients, because non-infectious causes of respiratory failure can present with similar clinical features. We develop an LRTI diagnostic method that combines the pulmonary transcriptomic biomarker FABP4 with assessment of electronic medical record (EMR) text by the large language model (LLM) Generative Pre-trained Transformer 4 (GPT-4). In a cohort of critically ill adults, a classifier combining FABP4 expression with LLM analysis of the EMR achieves an area under the receiver operating characteristic curve (AUC) of 0.93 ± 0.08 and an accuracy of 84%, outperforming FABP4 expression alone (AUC 0.84 ± 0.11) and LLM-based analysis alone (AUC 0.83 ± 0.07). By comparison, the medical team's admission diagnosis has an accuracy of 72%. In an independent validation cohort, the combined classifier yields an AUC of 0.98 ± 0.04 and an accuracy of 96%. This study suggests that integrating a host biomarker with LLM analysis can improve LRTI diagnosis in critically ill adults.
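For readers less familiar with the two metrics reported above, the following is a minimal sketch of how an AUC and an accuracy can be computed for two single-input scores and for a combination of them. It is an illustration under stated assumptions, not the study's pipeline: the abstract does not specify how the two inputs are combined, so the weighted average in `combine_scores` is a hypothetical stand-in, the function names are invented for this example, and the six "patients" are synthetic toy values constructed for illustration rather than study data.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases classified correctly when scores at or above
    the threshold are called positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def combine_scores(a: Sequence[float], b: Sequence[float],
                   weight: float = 0.5) -> List[float]:
    """Hypothetical combination rule (an assumption, not the study's
    method): a weighted average of two per-patient probabilities."""
    return [weight * x + (1.0 - weight) * y for x, y in zip(a, b)]


# Synthetic toy values, NOT study data: 1 = LRTI, 0 = no LRTI.
labels = [1, 1, 1, 0, 0, 0]
# Assumed per-patient LRTI probabilities derived from each input alone.
biomarker_prob = [0.9, 0.4, 0.7, 0.6, 0.2, 0.1]
llm_prob = [0.6, 0.8, 0.4, 0.2, 0.5, 0.1]
combined_prob = combine_scores(biomarker_prob, llm_prob)

for name, scores in [("biomarker", biomarker_prob),
                     ("LLM", llm_prob),
                     ("combined", combined_prob)]:
    print(f"{name}: AUC={roc_auc(labels, scores):.2f}, "
          f"accuracy={accuracy(labels, scores):.2f}")
```

The toy values were chosen so that each input misranks a different patient, which is the situation in which combining two complementary signals can help. The sketch says nothing about the size of any such benefit in real data.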