Evaluation of a Machine Learning Model Based on Laboratory Parameters for the Prediction of Influenza A and B in Chongqing, China: Multicenter Model Development and Validation Study

Abstract

BACKGROUND: Influenza viruses are major pathogens responsible for acute respiratory infections in humans, which present with symptoms such as fever, cough, sore throat, muscle pain, and fatigue. While molecular diagnostics remain the gold standard, their limited accessibility in resource-poor settings underscores the need for rapid, cost-effective alternatives. Routine blood parameters offer promising predictive value but have not been integrated into intelligent diagnostic systems for influenza subtyping.

OBJECTIVE: This study aimed to develop a machine learning model that uses 24 routine blood parameters to predict influenza A and B infections, and to validate a deployable diagnostic tool for low-resource clinical settings.

METHODS: In this multicenter retrospective study, 6628 adult patients (internal cohort: n=2951; external validation: n=3677) were enrolled from 3 hospitals between January 2023 and May 2024. Patients had been diagnosed with influenza A virus infection (A+ group) or influenza B virus infection (B+ group), or had presented with influenza-like symptoms but tested negative for both viruses (A-/B- group). The CatBoost (CATB) algorithm was optimized via 5-fold cross-validation and random grid search using 24 routine blood parameters. Model performance was evaluated using the area under the curve (AUC), accuracy, specificity, sensitivity, positive predictive value, negative predictive value, and F1-score across the internal testing and external validation cohorts, and Shapley additive explanations analysis was used to identify key biomarkers. The Artificial Intelligence Prediction of Influenza A and B (AI-Lab) tool was subsequently developed on the basis of the best-performing model.

RESULTS: In the internal testing cohort, 7 models (K-nearest neighbors, naïve Bayes, decision tree, random forest, extreme gradient boosting, gradient-boosting decision tree, and CatBoost) were evaluated. The AUC values for diagnosing influenza A ranged from 0.788 to 0.923, and those for influenza B ranged from 0.672 to 0.863. The CATB-based AI-Lab model achieved the best performance in detecting influenza A (AUC 0.923, 95% CI 0.897-0.947) and influenza B (AUC 0.863, 95% CI 0.814-0.911), significantly outperforming conventional models (K-nearest neighbors, random forest, and extreme gradient boosting; all P<.001). During external validation, AI-Lab achieved an accuracy of 0.913 for differentiating the A+ group from the A-/B- group and 0.939 for distinguishing the B+ group from the A-/B- group.

CONCLUSIONS: The CATB-based AI-Lab tool demonstrated high diagnostic accuracy for influenza A and B subtyping using routine laboratory data, achieving performance comparable to rapid antigen testing. By enabling timely subtype differentiation without specialized equipment, this system addresses critical gaps in managing influenza outbreaks, particularly in resource-constrained regions.
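
The evaluation metrics listed in the abstract (accuracy, sensitivity, specificity, positive and negative predictive value, F1-score, and AUC) can be illustrated with a minimal Python sketch. This is not the authors' code: the names `binary_metrics` and `roc_auc` and the toy labels and scores are hypothetical, and the example assumes a binary comparison such as A+ (label 1) versus A-/B- (label 0) with a 0.5 decision threshold chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    f1: float


def _safe_div(num: float, den: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return num / den if den else 0.0


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute confusion-matrix metrics for labels coded 1 (positive) and 0 (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = _safe_div(tp, tp + fn)  # true positive rate (recall)
    ppv = _safe_div(tp, tp + fp)          # positive predictive value (precision)
    return BinaryMetrics(
        accuracy=_safe_div(tp + tn, len(pairs)),
        sensitivity=sensitivity,
        specificity=_safe_div(tn, tn + fp),
        ppv=ppv,
        npv=_safe_div(tn, tn + fn),
        f1=_safe_div(2 * ppv * sensitivity, ppv + sensitivity),
    )


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for a sketch but slow for large cohorts.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

toy_metrics = binary_metrics(toy_labels, toy_preds)
toy_auc = roc_auc(toy_labels, toy_scores)
```

Note that the AUC depends only on the ranking of the predicted scores, whereas the other metrics depend on the chosen threshold, which is one reason the two kinds of figures are reported separately.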
