Abstract
Lung cancer is the leading cause of cancer mortality. To investigate genetic determinants of prognosis among patients diagnosed with early-stage non-small cell lung cancer (NSCLC), we conducted the first large-scale genome-wide association prognostic study using data from the International Lung Cancer Consortium (ILCCO) through a two-phase analysis. Phase 1 comprised a discovery genome-wide association study (GWAS), which used a multivariable Cox proportional hazards (PH) model on 3428 NSCLC patients of European ancestry from 10 ILCCO participating studies to identify genetic variants associated with overall survival, followed by validation of genome-wide significant variants (P-value ≤ 5 × 10⁻⁸) in The Cancer Genome Atlas (TCGA). Phase 2 aimed to identify causal variants through functional analyses of genome-wide significant and suggestive variants (P-value ≤ 1 × 10⁻⁵), including variant-epigenetic functional annotation (FAVOR), ChIP-seq data, variant-gene expression association, and colocalization analysis. We identified two significant variants; of these, a locus at 9q21.31 (rs117979484) was significant at the genome-wide level (P = 3.67 × 10⁻⁸) and was validated in TCGA (P = 0.03). Three suggestive variants were found to have a putative epigenetic function: the intronic variants rs149281784 (BCL7B gene) and rs148031766 (POM121 gene), both located at 7q11.23 and in moderate linkage disequilibrium with each other, and the variant rs2471630 (SRCIN1 gene; 17q12). Specifically, rs149281784 and rs148031766 have potential regulatory roles in the transcriptional activation of the BCL7B and POM121 genes. Exploratory survival analyses in the squamous cell carcinoma subgroup also identified a variant, rs138467404 (GRHL2 gene; 8q22.3), that was significant at the genome-wide level (P = 4.75 × 10⁻⁸) and validated in TCGA (P = 0.02). These new findings indicate potential novel pathways associated with early-stage NSCLC prognosis.
Future research may validate additional genome-wide suggestive variants as relevant to lung cancer outcomes.