Abstract
OBJECTIVE: In advanced-stage lung cancer, the brain is the predominant site of metastasis. Accurately predicting both the risk of brain metastasis and patient survival is a clinical priority, but existing prediction tools lack sufficient accuracy.
MATERIALS AND METHODS: This retrospective cohort study used SEER database records of lung cancer cases from 2010 to 2015, including demographic and clinical data. Multivariate logistic regression was first used for variable selection, followed by Cox proportional hazards modeling to identify risk factors for brain metastasis development and prognostic indicators. Significant predictors were then entered into six machine learning algorithms to build risk stratification models. Model performance was evaluated using the area under the ROC curve (AUC), confusion matrices, and calibration curves.
RESULTS: The cohort comprised 25,072 lung cancer patients. Significant predictors of prognosis included demographic characteristics (age, gender, ethnicity), tumor features (primary location, histological classification, differentiation grade), disease progression markers (T/N staging), therapeutic interventions (surgery, radiotherapy, chemotherapy), and social factors (marital status). For brain metastasis risk, the key determinants were age, tumor grade, histopathological type, and treatment modalities. Among the six models compared, the Gradient Boosting Machine (GBM) performed best in both cross-validation and internal testing.
CONCLUSION: The GBM-based model, delivered as a user-friendly web application, lets clinicians estimate the probability of brain metastasis in lung cancer patients and supports personalized treatment planning through accessible risk stratification.
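For readers unfamiliar with the three evaluation measures named in the methods (area under the ROC curve, confusion matrix, calibration curve), the following is a minimal illustrative sketch of how each can be computed from predicted probabilities. It is not the authors' code: the function names (`roc_auc`, `confusion_matrix`, `calibration_bins`), the 0.5 decision threshold, the two-bin calibration, and the toy data are all assumptions made for illustration, and no SEER data are used.

```python
from typing import List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_matrix(y_true: Sequence[int], y_score: Sequence[float],
                     threshold: float = 0.5) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) after thresholding the predicted probability.
    The 0.5 default is an illustrative assumption."""
    tn = fp = fn = tp = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def calibration_bins(y_true: Sequence[int], y_score: Sequence[float],
                     n_bins: int = 2) -> List[Tuple[float, float, int]]:
    """Group cases into equal-width probability bins and return, per
    non-empty bin, (mean predicted probability, observed event rate, count).
    A well-calibrated model has the first two values close together."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, s in zip(y_true, y_score):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        bins[idx].append((y, s))
    out = []
    for b in bins:
        if b:
            mean_pred = sum(s for _, s in b) / len(b)
            observed = sum(y for y, _ in b) / len(b)
            out.append((mean_pred, observed, len(b)))
    return out


# Toy example (hypothetical labels and predicted probabilities):
# 1 = brain metastasis, 0 = no brain metastasis.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

if __name__ == "__main__":
    print("AUC:", roc_auc(toy_labels, toy_scores))
    print("(tn, fp, fn, tp):", confusion_matrix(toy_labels, toy_scores))
    print("calibration bins:", calibration_bins(toy_labels, toy_scores))
```

In practice these measures would more likely be obtained from an established library than hand-written; the explicit versions are shown only to make the definitions concrete.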