From clinical phenotypes to genomic signatures: machine learning integration for precision tuberculosis treatment prediction



Abstract

BACKGROUND: Tuberculosis (TB) remains a major global health threat, causing approximately 1.5 million deaths each year. Despite progress in treatment, 15%-20% of patients still experience treatment failure or relapse, which highlights the need for accurate predictive tools that identify high-risk patients early. Current methods based on clinical parameters are limited both in predictive accuracy and in their ability to reveal underlying biological mechanisms.

METHODS: This study developed and validated a multi-omics integration prediction model. We retrospectively collected clinical data from 467 tuberculosis patients and integrated transcriptomic data from three independent public cohorts (GSE19491, GSE31312, GSE83456), involving 3,240 differentially expressed genes. Key features were selected through feature engineering and bioinformatics analysis. We systematically evaluated 12 machine learning algorithms and adopted an ensemble learning strategy to construct the final model. Model performance was evaluated by cross-validation and in prospective validation cohorts.

RESULTS: Clinical data analysis identified age, body mass index (BMI), and C-reactive protein (CRP) levels as significant predictors of treatment response. Transcriptomic analysis revealed 1,247 genes differentially expressed between responders and non-responders, enriched in immune response and metabolic pathways. Among the tested algorithms, the ensemble model based on Extra Trees performed best, with an area under the curve (AUC) of 0.986, significantly higher than models using only clinical data (AUC = 0.850) or only genomic data (AUC = 0.820). Feature importance analysis confirmed CRP, specific gene features (such as DNA repair and interferon response pathways), age, and BMI as the most important predictors. External validation confirmed the model's robustness (AUC = 0.972).

CONCLUSION: This study developed a high-precision prediction model that integrates clinical and genomic data and can identify, at an early stage, high-risk patients likely to respond poorly to treatment. The model shows excellent predictive performance and generalization ability, and offers a tool for advancing tuberculosis precision medicine by guiding individualized treatment strategies to improve patient prognosis and limit the spread of drug resistance.

CLINICAL TRIAL REGISTRATION: https://www.chictr.org.cn/, ChiCTR2300074328, 03/08/2023.
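
The abstract compares models by AUC. As a minimal sketch of what that metric measures, the Python snippet below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `roc_auc`, the toy labels, and the two score lists are hypothetical illustrations and are not taken from the study, which does not describe how its AUC values were computed.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (not from the study):
# 1 = poor treatment response, 0 = good treatment response.
labels = [1, 1, 1, 0, 0, 0]
clinical_only = [0.9, 0.4, 0.6, 0.5, 0.2, 0.3]  # one positive ranked below a negative
combined = [0.9, 0.7, 0.8, 0.5, 0.2, 0.3]       # every positive ranked above every negative

auc_clinical = roc_auc(labels, clinical_only)
auc_combined = roc_auc(labels, combined)
print(f"clinical-only AUC: {auc_clinical:.3f}")
print(f"combined AUC:      {auc_combined:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a rank-based formulation would be preferable for cohorts of realistic size.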
