Abstract
BACKGROUND/OBJECTIVES: Immunotherapy is a viable therapeutic approach for non-small cell lung cancer (NSCLC). Despite the significant survival benefit of PD-1/PD-L1 immune checkpoint inhibitors, the objective response rate averages around 20% as monotherapy and around 50% in combination with chemotherapy. While PD-L1 immunohistochemistry (IHC) is used as a predictive biomarker, its accuracy is subpar. METHODS: In this work, we develop a machine learning (ML) method to predict response to immunotherapy in NSCLC from multimodal data: clinicopathological biomarkers together with tumor and peritumoral radiomic biomarkers from CT images. We further learn a graph structure to understand the associations between biomarkers and treatment response. The graph is then used to create sentences stating clinical hypotheses, which are finally used in a Large Language Model (LLM) that explains the treatment response based on the biomarkers, in terms comprehensible to clinicians. A training dataset from a retrospective NSCLC study, comprising n = 248 tumors from 140 subjects, was used for feature selection, ML model training, learning the graph structure, and fine-tuning the LLM. RESULTS: The method achieved an AUC of 0.83 for prediction of treatment response on a separate test dataset of n = 84 tumors from 47 subjects. CONCLUSIONS: Our study not only improves the prediction of immunotherapy response in patients with NSCLC from multimodal data but also helps clinicians make clinically interpretable predictions by providing language-based explanations.