Abstract
Systemic chemotherapy is the cornerstone of treatment for patients with locally advanced non-small-cell lung cancer (NSCLC). Anticancer therapy causes various adverse effects (AEs), which limit the efficacy of chemotherapy. Precise prediction and early detection of AEs could improve both the efficacy of chemotherapy and patients' quality of life. In the present study, machine learning (ML) algorithms, including random forest (RF), multilayer perceptron and AdaBoost, were employed to develop prediction models for common AEs using dynamic treatment information. A total of 1,659 data points of chemotherapy information for 403 patients with NSCLC who underwent chemotherapy were extracted from an electronic health record system. Five-fold cross-validation was performed, and the receiver operating characteristic (ROC) curve and calibration curve were used to evaluate model performance. Patients with multi-AEs had worse therapeutic efficacy of neoadjuvant chemotherapy (P<0.001; Fisher's exact test) and a worse prognosis (P<0.05; log-rank test) than patients without multi-AEs. The area under the ROC curve values of the RF model were 0.75, 0.74 and 0.76 for predicting myelosuppression, low albumin and hepatic impairment, respectively, and its calibration curve was linear over the calibration range, with r²≥0.99 for the fitted regression. The RF model outperformed the other models. Performance improved markedly when fewer than 10 selected features were used, with feature importance ranked by Shapley Additive Explanation values. In conclusion, the occurrence of multi-AEs limits the efficacy of chemotherapy and negatively affects the outcomes of patients with lung cancer. ML-based prediction models of chemotherapy-associated AEs may represent a breakthrough for improving the prognosis of patients receiving chemotherapy for lung cancer.
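As an illustration of the evaluation metric named above, the following is a minimal sketch of how an area under the ROC curve can be computed from predicted scores, together with a simple five-fold index split. It is not the study's implementation: the function names (`roc_auc`, `five_fold_indices`), the fold assignment rule and the toy labels and scores are all hypothetical and were chosen only to keep the example self-contained.

```python
from typing import Iterator, List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def five_fold_indices(
    n_samples: int, n_folds: int = 5
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; sample i is held out in fold i % n_folds.

    This assignment rule is an assumption made for illustration only.
    """
    for k in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == k]
        train = [i for i in range(n_samples) if i % n_folds != k]
        yield train, test


# Hypothetical toy data: 1 = AE occurred, 0 = no AE; scores are model outputs.
toy_labels = [1, 0, 1, 0, 1, 0, 0, 1, 0, 1]
toy_scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.5, 0.1, 0.8, 0.3, 0.35]

if __name__ == "__main__":
    print(f"AUC on toy data: {roc_auc(toy_labels, toy_scores):.2f}")
    for fold, (train, test) in enumerate(five_fold_indices(len(toy_labels))):
        print(f"fold {fold}: held-out indices {test}")
```

In a cross-validated evaluation, a model would be fitted on each `train` split and `roc_auc` computed on the corresponding held-out `test` split, with the per-fold values then summarized.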