Abstract
Immune checkpoint inhibitors (ICIs), either as monotherapy (ICI-Mono) or combined with chemotherapy (ICI-Chemo), improve survival in advanced non-small cell lung cancer (NSCLC). However, prospective guidance for choosing between these options remains limited, and single-feature biomarkers such as PD-L1 are inadequate. We develop a machine learning model using clinicogenomic data from four cohorts (MD Anderson, n = 750; Mayo Clinic, n = 80; Dana-Farber, n = 1077; Stand Up To Cancer, n = 393) to predict individual benefit from adding chemotherapy. Benefit scores are calculated using five distinct functions derived from 28 genomic and 6 clinical features. Our integrated model, A-STEP (Attention-based Scoring for Treatment Effect Prediction), estimates heterogeneous treatment effects and achieves the largest reduction in 3-month progression risk, improving weighted risk reduction by 13-23% over stand-alone models. A-STEP recommends treatment changes for over 50% of patients, most often favoring ICI-Chemo. In a simulation on an external cohort, patients treated in accordance with A-STEP recommendations show improved 2-year progression-free survival (HR = 0.60 for the ICI-Mono treatment arm; HR = 0.58 for the ICI-Chemo treatment arm). Predictive features include FBXW7, APC, and PD-L1. This study demonstrates how machine learning can fill critical gaps in immunotherapy selection for NSCLC by modeling treatment heterogeneity with real-world clinicogenomic data, extending precision medicine beyond conventional biomarker boundaries.