Background
Non-small-cell lung cancer (NSCLC) remains a global health challenge, driving morbidity and mortality. The emerging field of radiogenomics utilizes statistical
Conclusion
The successful integration of heterogeneous radiogenomic datasets underscores the potential of imaging biomarkers in uncovering NSCLC biological processes through gene expression profiles.
Methods
In a retrospective study of two NSCLC patient cohorts collected 5 years apart, we performed a radiogenomic analysis of previously disseminated data from 2018 (n = 116) and newly acquired data from 2023 (n = 44), using RNA sequencing and lung CT images. Pooling the two cohorts after per-cohort binarization of gene expression and per-cohort batch normalization of radiomic features performed better than training the model on one cohort and validating on the other.
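The per-cohort harmonization described above can be illustrated with a minimal sketch. It is not the study's pipeline: the median threshold for binarization, the z-score form of batch normalization, the function names, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def binarize_expression(expr):
    """Label samples 1 (high) or 0 (low) relative to the cohort median.
    The median threshold is an assumption, not the study's stated rule."""
    expr = np.asarray(expr, dtype=float)
    return (expr > np.median(expr)).astype(int)

def zscore_features(features):
    """Standardize each radiomic feature column within a single cohort
    (an assumed form of batch normalization)."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # leave constant features at zero instead of dividing by zero
    return (features - mean) / std

def combine_cohorts(cohorts):
    """Pool cohorts after harmonizing each one separately.
    cohorts: list of (features, expression) pairs, one pair per cohort."""
    X = np.vstack([zscore_features(f) for f, _ in cohorts])
    y = np.concatenate([binarize_expression(e) for _, e in cohorts])
    return X, y

# Synthetic stand-ins for the two cohorts, with deliberately different scales.
rng = np.random.default_rng(0)
cohort_2018 = (rng.normal(5.0, 2.0, size=(116, 3)), rng.lognormal(1.0, 0.5, size=116))
cohort_2023 = (rng.normal(9.0, 4.0, size=(44, 3)), rng.lognormal(3.0, 0.5, size=44))

X, y = combine_cohorts([cohort_2018, cohort_2023])
print(X.shape, y.shape)
```

Because each cohort is thresholded and standardized on its own statistics, a systematic offset between the 2018 and 2023 acquisitions does not leak into the pooled labels or features.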
Results
Our ML-based radiogenomic modeling identified specific imaging features (wavelet, three-dimensional local binary patterns, and logarithmic sigma of gray-level variance) as predictive indicators of high (1) vs. low (0) expression of pivotal NSCLC-related genes: SLC35C1, BCL2L1, and MAPK1. These genes have recognized implications in a variety of biological pathways and mechanisms of drug resistance pertinent to NSCLC.
