Identification of routine blood derived hematological and lipid indices in ILD through machine learning; a retrospective case-control study



Abstract

INTRODUCTION: Interstitial lung disease (ILD) comprises a range of disorders marked by pulmonary inflammation and fibrosis. Early diagnosis and risk prediction are vital for improving patient outcomes.

METHODS: We retrospectively analyzed 603 patients who visited the Hubin Campus between January 2022 and April 2025, using a 1:2 case-control design with age- and gender-matched groups. We collected clinical information, complete blood count data, lipid metabolism indicators, and various derived indices.

RESULTS: Three machine learning algorithms (LassoCV, SVM-RFECV, and Boruta) identified six key markers: neutrophil percentage, lymphocyte percentage, monocyte percentage, hemoglobin, and two novel ratios, neutrophil-to-HDL-C and lymphocyte-to-HDL-C, where HDL-C is high-density lipoprotein cholesterol. The random forest model outperformed seven other machine learning approaches, with AUC values of 0.868 (validation set), 0.885 (test set), and 0.849 (external cohort), demonstrating consistent predictive accuracy.

DISCUSSION: Based on these findings, we developed an online prediction tool to help primary care clinicians assess the risk of ILD in suspected cases. Our results indicate that the random forest model offers high accuracy and clinical utility for early ILD prediction, providing a novel tool and methodology for early diagnosis and intervention. Future studies will focus on further optimizing the model and validating it in larger multicenter cohorts.
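The three quantitative ideas in the abstract (HDL-C-derived ratios, agreement between feature selectors, and AUC) can be illustrated with a minimal sketch. It is not the authors' pipeline. The class and function names are hypothetical, and the units are assumptions: the abstract does not say whether the ratios use absolute counts or percentages. Combining selectors by intersection is only one possible rule, since the abstract does not state how the three algorithms were combined. The input values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class BloodPanel:
    """Routine values for one patient (field names and units are assumed)."""
    neutrophils: float  # absolute count, 10^9/L (assumption)
    lymphocytes: float  # absolute count, 10^9/L (assumption)
    hdl_c: float        # HDL cholesterol, mmol/L (assumption)


def derived_ratios(panel: BloodPanel) -> dict[str, float]:
    """Compute the two HDL-C ratios named in the abstract."""
    if panel.hdl_c <= 0:
        raise ValueError("HDL-C must be positive to form a ratio")
    return {
        "neutrophil_to_hdl_c": panel.neutrophils / panel.hdl_c,
        "lymphocyte_to_hdl_c": panel.lymphocytes / panel.hdl_c,
    }


def consensus_features(*selections: set[str]) -> set[str]:
    """Keep features chosen by every selector (one possible consensus rule)."""
    return set.intersection(*selections) if selections else set()


def roc_auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the chance that a random case outscores a random control.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Illustrative values only, not data from the study.
ratios = derived_ratios(BloodPanel(neutrophils=4.0, lymphocytes=2.0, hdl_c=1.25))
shared = consensus_features({"hgb", "nhr"}, {"nhr", "mono_pct"}, {"nhr"})
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise AUC loop is quadratic in cohort size. That is acceptable for a few hundred patients, but a rank-based implementation from an established statistics library would be the usual choice in practice.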
