Detection of LUAD-Associated Genes Using Wasserstein Distance in Multiomics Feature Selection

Abstract

Lung adenocarcinoma (LUAD) is characterized by substantial genetic heterogeneity, which makes it difficult to identify reliable biomarkers for diagnosis and treatment. Tumor mutational burden (TMB) is widely recognized as a predictive biomarker because of its association with immune response and treatment efficacy. In this study, we take a different approach: we treat TMB as a response variable and use multiomics data to uncover its genetic drivers. We evaluated recent feature selection methods through extensive simulations and identified three top-performing approaches: projection correlation screening (PC-Screen), distance correlation sure independence screening (DC-SIS), and Wasserstein distance-based screening (WD-Screen). Unlike traditional approaches that rely on simple statistical tests or dataset splitting for validation, we adopt a method-based validation strategy: we select the top-ranked features from each method and retain the genes that are selected consistently by all three. Using The Cancer Genome Atlas (TCGA) dataset, we integrated copy number alteration (CNA), mRNA expression, and DNA methylation data as predictors and applied the three selected methods. In the two-platform analysis (mRNA + CNA), we identified 13 key genes, including both previously reported LUAD-associated genes (CCNG1, CKAP2L, HSD17B4, SHROOM1, TIGD6, and TMEM173) and novel candidates (DTWD2, FLJ33630, NME5, NUDT12, PCBD2, REEP5, and SLC22A5). Expanding to a three-platform analysis (mRNA + CNA + methylation) further refined these findings, with PCBD2 and TMEM173 emerging as robust candidates. These results highlight the complexity of multiomics integration and the need for advanced feature selection techniques to uncover biologically meaningful patterns. Our multiomics strategy and robust selection approach provide insights into the genetic determinants of TMB, offer potential biomarkers for targeted LUAD therapies, and demonstrate the power of Wasserstein distance-based feature selection in complex genomic analysis.
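
The method-based validation strategy described in the abstract (rank features with each screening method, keep the top-ranked ones, and retain only those selected by every method) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the toy data, and the cutoff `k` are assumptions made for illustration. The sample distance correlation stands in for the DC-SIS utility, and the absolute Pearson correlation is only a placeholder second ranker; it is not PC-Screen or WD-Screen, whose exact definitions the abstract does not give.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation between two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise absolute distances within each sample.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double-center each distance matrix.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return 0.0 if denom == 0 else float(np.sqrt(max(dcov2, 0.0) / denom))


def top_k(scores, k):
    """Indices of the k largest scores."""
    return set(np.argsort(scores)[::-1][:k].tolist())


def consensus_features(score_table, k):
    """Features ranked in the top k by every method in score_table."""
    selected = [top_k(scores, k) for scores in score_table.values()]
    return sorted(set.intersection(*selected))


# Toy data (hypothetical): feature 0 acts linearly, feature 3 nonlinearly.
rng = np.random.default_rng(0)
n, p = 200, 30
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + X[:, 3] ** 2 + 0.5 * rng.normal(size=n)

score_table = {
    # Distance correlation utility, in the spirit of DC-SIS.
    "dc": np.array([distance_correlation(X[:, j], y) for j in range(p)]),
    # Placeholder ranker only; NOT one of the paper's methods.
    "abs_pearson": np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)]),
}

consensus = consensus_features(score_table, k=5)
```

Taking the intersection is deliberately conservative: a feature survives only if every ranker places it near the top, so adding a method or data platform can only keep the consensus set the same size or shrink it.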
