A machine learning-based diagnostic model for myocardial infarction patients: Analysis of neutrophil extracellular traps-related genes and eQTL Mendelian randomization

Abstract

This study aimed to identify neutrophil extracellular trap (NET)-associated gene features in the blood of patients with myocardial infarction (MI) using bioinformatics and machine learning, with the aim of exploring their potential diagnostic utility in atherosclerosis. The datasets GSE66360 and GSE48060 were downloaded from the Gene Expression Omnibus (GEO) public database; GSE66360 served as the training set and GSE48060 as an independent validation set. Differentially expressed NET-related genes were screened using R software, and machine learning was performed on their expression across samples. The strengths and weaknesses of four machine learning algorithms (Random Forest [RF], Extreme Gradient Boosting [XGBoost, XGB], Generalized Linear Models [GLM], and Support Vector Machine-Recursive Feature Elimination [SVM-RFE]) were compared, and the best-performing method was used to screen feature genes and construct a diagnostic model, which was then validated in the external validation dataset. Correlations between the feature genes and immune cells were analyzed, and samples were reclustered based on feature gene expression. Differences in downstream molecular mechanisms and immune responses were explored between clusters. Weighted Gene Co-expression Network Analysis was performed on the clusters, and disease-related NET genes were extracted and subjected to Gene Ontology and Kyoto Encyclopedia of Genes and Genomes analyses. Finally, Mendelian randomization was used to further investigate the causal relationship between expression of the model genes and the occurrence of MI. Forty-seven differentially expressed NET-related genes were obtained. After comparison of the four machine learning methods, SVM-RFE was used to screen ATG7, MMP9, interleukin 6 (IL6), DNASE1, and PDE4B as key genes for constructing the diagnostic model, and the diagnostic value of the model was validated in the independent external dataset. These five genes correlated strongly with neutrophils. The sample clusters showed differential enrichment in pathways including nitrogen metabolism, complement and coagulation cascades, cytokine-cytokine receptor interaction, the renin-angiotensin system, and steroid biosynthesis. The Mendelian randomization results indicated a causal relationship between ATG7 expression and the incidence of MI. The feature genes ATG7, MMP9, IL6, DNASE1, and PDE4B, identified using bioinformatics, may serve as potential diagnostic biomarkers and therapeutic targets for MI; in particular, ATG7 expression may be a significant factor in the occurrence of MI.
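
To make the Mendelian randomization step more concrete, the following is a minimal sketch of a fixed-effect inverse-variance weighted (IVW) estimate, one commonly used estimator in two-sample Mendelian randomization. The abstract does not state which estimator the authors used, so IVW is an assumption here, and the function name `ivw_estimate` and the summary statistics in `toy_instruments` are hypothetical values chosen for illustration only; they are not data from the study.

```python
import math


def ivw_estimate(instruments):
    """Fixed-effect inverse-variance weighted (IVW) causal estimate.

    Each instrument is a tuple (beta_exposure, beta_outcome, se_outcome):
    the SNP effect on gene expression (eQTL), the SNP effect on the
    outcome (log-odds of MI), and the standard error of the latter.
    """
    numerator = 0.0
    denominator = 0.0
    for beta_x, beta_y, se_y in instruments:
        weight = 1.0 / se_y ** 2          # inverse variance of the outcome effect
        numerator += weight * beta_x * beta_y
        denominator += weight * beta_x ** 2
    beta = numerator / denominator         # pooled causal effect (log-odds scale)
    se = math.sqrt(1.0 / denominator)      # fixed-effect standard error
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, p_value


# Hypothetical summary statistics, NOT taken from the study.
toy_instruments = [
    (0.20, 0.10, 0.02),
    (0.40, 0.20, 0.04),
    (0.10, 0.05, 0.01),
]

beta, se, p_value = ivw_estimate(toy_instruments)
odds_ratio = math.exp(beta)  # odds ratio per unit increase in expression
print(f"beta={beta:.3f} se={se:.3f} OR={odds_ratio:.3f} p={p_value:.2e}")
```

In this toy input every instrument has the same ratio of outcome effect to exposure effect, so the pooled estimate equals that shared ratio. With real eQTL instruments the per-SNP ratios differ, and heterogeneity and pleiotropy checks would normally accompany the estimate.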