Abstract
OBJECTIVE: Acute respiratory distress syndrome (ARDS) is a common complication in patients with non-pulmonary sepsis. Early identification and prediction of ARDS in these patients is vital for timely intervention and improved prognosis. MATERIALS AND METHODS: A total of 482 patients were included in this study. Recursive Feature Elimination (RFE) was used to identify the key variables related to the prognosis of sepsis. The selected variables were used to construct nine machine learning prediction models. Model performance was evaluated using the receiver operating characteristic (ROC) curve, the calibration curve, and decision curve analysis (DCA). The clinical significance of the model was further examined with Shapley Additive Explanations (SHAP) analysis. RESULTS: RFE yielded a final set of 11 variables. The LightGBM model achieved an AUC of 0.954 (95% CI: 0.933-0.973) in the training set and 0.923 (95% CI: 0.864-0.967) in the test set. Its calibration curve lay close to the diagonal, indicating that its predicted probabilities were relatively reliable. In DCA, the LightGBM model consistently maintained the highest net benefit across the threshold probability range of 0-0.4, indicating greater clinical utility than the other models. SHAP analysis identified the SOFA score, PaO2/FiO2 ratio, lactate level, creatinine, and SAPS II score as the five most important features for the model's predictions. CONCLUSION: A machine learning model based on inflammatory indicators and blood gas parameters was successfully developed and validated to predict the risk of ARDS in patients with non-pulmonary sepsis.
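
To make the decision curve analysis more concrete, the sketch below computes the standard net benefit quantity, NB = TP/n - (FP/n) * pt/(1 - pt), at a given threshold probability pt, alongside the usual "treat all" reference strategy. It is an illustration only and not the authors' implementation: the function names `net_benefit` and `treat_all_net_benefit` and the toy labels and probabilities are hypothetical and do not come from the study data.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold."""
    if not 0 <= threshold < 1:
        raise ValueError("threshold must satisfy 0 <= threshold < 1")
    n = len(y_true)
    # True positives: flagged patients who actually developed the outcome.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    # False positives: flagged patients who did not develop the outcome.
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def treat_all_net_benefit(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy data (1 = outcome occurred, 0 = it did not).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.3, 0.05]

for pt in (0.1, 0.2, 0.3, 0.4):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = treat_all_net_benefit(y_true, pt)
    print(f"pt={pt:.1f}  model={model_nb:.3f}  treat_all={all_nb:.3f}")
```

A model is generally considered clinically useful at a given threshold when its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy); comparing several models over a threshold range amounts to repeating this calculation for each model.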