Abstract
OBJECTIVE: Immune checkpoint inhibitor-related pneumonitis (ICIP) is a common and potentially life-threatening adverse event with non-specific symptoms, so identifying patients at high risk of ICIP is clinically important. However, existing prediction models for ICIP are often limited by their reliance on clinically inaccessible variables and homogeneous methodologies, which hinders their clinical utility. This study aimed to develop a clinical risk-prediction model for ICIP in patients with gastrointestinal (GI) cancer using four machine learning (ML) methods. METHODS: We retrospectively analyzed data from GI cancer patients who received immune checkpoint inhibitors (ICIs) at Beijing Cancer Hospital between 2018 and 2022. For each patient, 36 clinical indicators associated with pneumonitis risk were collected. The dataset was split into training and testing sets at a ratio of 7:3. Variable selection was first performed using Least Absolute Shrinkage and Selection Operator (LASSO) regression. Subsequently, four ML algorithms, namely logistic regression (LR), random forest (RF), support vector machine (SVM), and adaptive boosting (AdaBoost), were used to develop and validate ICIP prediction models. Model performance was assessed using sensitivity, specificity, precision, F1-score, and the area under the receiver operating characteristic curve (AUC). The optimal cutoff point for the best-performing model was determined, and a web-based tool was developed based on that model. RESULTS: We collected medical data from 1,101 GI cancer patients. Ten predictive variables were identified as significant: gender, age, treatment line, smoking index, drinking history, lung metastasis, neutrophil-to-lymphocyte ratio, platelet-to-lymphocyte ratio, hemoglobin, and albumin. Among the four ML models constructed and compared, the RF model demonstrated the best performance, with an AUC of 0.899.
The web-based tool for ICIP risk prediction is available at https://healthy.aistarfish.com/business/pneumonia-prediction/#/home. CONCLUSIONS: We analyzed 36 candidate clinical predictors of ICIP in 1,101 GI cancer patients treated with ICIs, of which 10 variables were included in the models. The smoking index, albumin, and hemoglobin emerged as novel predictors specific to GI cancers. Among the models constructed using four ML methods, the RF model showed the best performance. Additionally, a web-based tool was developed to facilitate the early clinical identification of patients at high risk of ICIP. Future work includes external validation of the model to enhance its clinical usability.
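The evaluation metrics and cutoff selection named in METHODS can be illustrated with a minimal sketch. The abstract does not state how the optimal cutoff was chosen; maximizing Youden's J (sensitivity + specificity - 1) is assumed here as one common approach. The function names are hypothetical, and the labels and scores are toy values for illustration only, not study data.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], cutoff: float
) -> Dict[str, float]:
    """Sensitivity, specificity, precision and F1 at a given probability cutoff."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= cutoff else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    # Guard every ratio against an empty denominator.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def youden_cutoff(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Return the observed score that maximizes Youden's J (assumed criterion)."""
    best_cutoff = None
    best_j = float("-inf")
    for cutoff in sorted(set(y_score)):
        m = classification_metrics(y_true, y_score, cutoff)
        j = m["sensitivity"] + m["specificity"] - 1.0
        if j > best_j:  # on ties, the lowest cutoff is kept
            best_cutoff, best_j = cutoff, j
    return best_cutoff


# Toy example (hypothetical values): 1 = ICIP, 0 = no ICIP.
y_true = [0, 0, 0, 1, 0, 0, 1, 1, 1]
y_score = [0.05, 0.10, 0.20, 0.35, 0.40, 0.45, 0.60, 0.80, 0.90]

cutoff = youden_cutoff(y_true, y_score)
metrics = classification_metrics(y_true, y_score, cutoff)
print(cutoff, metrics)
```

In practice the scores would be the predicted probabilities from the fitted model on the testing set, and other cutoff criteria (for example, a fixed minimum sensitivity) could be substituted for Youden's J.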