A machine learning model for predicting severe Mycoplasma pneumoniae pneumonia in school-aged children



Abstract

OBJECTIVE: To develop an interpretable machine learning (ML) model for predicting severe Mycoplasma pneumoniae pneumonia (SMPP) in order to provide reliable factors for predicting the clinical type of the disease.

METHODS: We collected clinical data from 483 school-aged children with M. pneumoniae pneumonia (MPP) who were hospitalized at the Children's Hospital of Soochow University between September 2021 and June 2024. Difference analysis and univariate logistic regression were employed to identify predictors for training features in ML. Eight ML algorithms were used to build models based on the selected features, and their effectiveness was validated. The area under the curve (AUC), accuracy, five-fold cross-validation, and decision curve analysis (DCA) were utilized to evaluate model performance. Finally, the best-performing ML model was selected, and the Shapley Additive Explanations (SHAP) method was applied to rank the importance of clinical features and interpret the final model.

RESULTS: After feature selection, 30 variables remained. We constructed eight ML models and assessed their effectiveness, finding that the CatBoost model exhibited the best predictive performance, with an AUC of 0.934 and an accuracy of 0.9175. DCA was used to compare the clinical benefits of the models, revealing that the CatBoost model provided greater net benefits than the other ML models within the threshold probability range of 34% to 75%. Additionally, we applied the SHAP method to interpret the CatBoost model, and the SHAP diagram was used to visually show the influence of predictor variables on the outcome. The results identified the top six risk factors as the number of days with fever, D-dimer, platelet count (PLT), C-reactive protein (CRP), lactate dehydrogenase (LDH), and the neutrophil-to-lymphocyte ratio (NLR).

CONCLUSIONS: The interpretable CatBoost model can help physicians accurately identify school-aged children with SMPP. This early identification facilitates better treatment options and timely prevention of complications. Furthermore, the SHAP algorithm enhances the model's transparency and increases its trustworthiness in practical applications.
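The abstract compares models by net benefit across threshold probabilities but does not spell out how DCA computes it. The sketch below uses the standard net-benefit definition, NB = TP/n - (FP/n) * pt / (1 - pt), where pt is the threshold probability. The function names and the toy labels and probabilities are hypothetical illustrations. They are not the study's code or data.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of flagging every patient whose predicted risk is >= threshold.

    Standard decision-curve definition: TP/n - (FP/n) * pt / (1 - pt).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    if n == 0 or n != len(y_prob):
        raise ValueError("y_true and y_prob must be non-empty and equally long")
    # Count true and false positives among patients flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: treat every patient as severe regardless of the model."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = severe (SMPP), 0 = non-severe.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
    for pt in (0.34, 0.50, 0.75):  # thresholds inside the range the abstract reports
        model_nb = net_benefit(y_true, y_prob, pt)
        all_nb = net_benefit_treat_all(y_true, pt)
        print(f"pt={pt:.2f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

A decision curve plots these values over a grid of thresholds. A model is preferred at a given threshold when its net benefit exceeds both the treat-all line and zero, the net benefit of treating no one.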
