Abstract
PURPOSE: This study aims to improve survival modeling in head and neck cancer (HNC) by integrating patient-reported outcomes (PROs) using dimensionality reduction techniques. PROs capture symptom severity across the treatment timeline and offer key insights for personalized care. However, their high dimensionality poses challenges such as overfitting and computational complexity. This work focuses on transforming and incorporating PRO data to enhance model performance in HNC.
MATERIALS AND METHODS: We analyzed retrospective data of 923 patients with HNC treated at the University of Texas MD Anderson Cancer Center between 2010 and 2021. Baseline clinical data including demographic, treatment, and disease characteristics were used to build a reference survival model. PRO data, capturing symptom ratings, were integrated using dimensionality reduction techniques: principal component analysis (PCA), autoencoders (AEs), and patient clustering. These reduced representations, combined with clinical data, were input into Cox proportional hazards models to predict overall survival (OS) and progression-free survival (PFS). Model performance was assessed using the concordance index, time-dependent AUC, Brier score for calibration, and hazard ratios for predictor significance.
RESULTS: Cox models incorporating PCA and AE outperformed the clinical-only reference model for both OS and PFS. The PCA-based model achieved the highest C-indices (0.74 for OS and 0.64 for PFS), followed by the AE model (0.73 and 0.63) and the clustering model (0.72 and 0.62). Time-dependent AUCs reinforced these results, with PCA showing the highest average AUC over 36 months. All models were well-calibrated, with low Brier scores. Key predictors included age, disease stage, and tumor subsite.
CONCLUSION: Dimensionality reduction techniques improve survival prediction in patients with HNC by effectively incorporating PRO data, which may help inform more personalized treatment strategies.
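For readers unfamiliar with the concordance index reported above, the following is a minimal sketch of how Harrell's C-index can be computed from follow-up times, event indicators, and model risk scores. It is not the study's code: the function name `concordance_index`, the simplified handling of tied times (such pairs are skipped), and the five-patient data are illustrative assumptions, and the numbers are made up rather than taken from the cohort.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable patient pairs ranked correctly by risk.

    times  -- follow-up time per patient (e.g. months)
    events -- 1 if the event (death or progression) was observed, 0 if censored
    risks  -- model risk score per patient (higher = expected earlier event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Simplification (assumption): pairs with identical times are skipped.
        if times[i] == times[j]:
            continue
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # The pair is comparable only if the shorter time is an observed event;
        # if `a` was censored, we cannot tell who had the event first.
        if not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier event has higher risk: correct order
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five patients, two of them censored.
times = [5, 12, 20, 30, 36]
events = [1, 1, 0, 1, 0]
risks = [2.1, 0.4, 1.5, 0.9, 0.2]

c = concordance_index(times, events, risks)
print(f"C-index: {c:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the C-indices of 0.62 to 0.74 reported in the RESULTS.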