Abstract
This study investigated the relationship between the metabolic score for visceral fat (METS-VF) and asthma risk, and assessed the role of METS-VF in machine learning predictive models. Data came from the National Health and Nutrition Examination Survey (NHANES) from 2001 to 2018; after excluding participants with missing values, 13,695 participants were included. The Boruta algorithm was used to screen candidate variables. Participants were then randomly divided into training and validation sets at a 7:3 ratio, and the selected variables were entered into the machine learning models. SHapley Additive exPlanations (SHAP) analysis was used to evaluate the importance of each variable. The Boruta algorithm identified 14 variables: age, gender, education level, race, marital status, smoking history, alcohol consumption, poverty-income ratio, body mass index, METS-VF, hypertension, diabetes, cancer, and cardiovascular disease. Among the machine learning models evaluated, CatBoost achieved the highest area under the receiver operating characteristic curve, 0.640 (95% CI: 0.617-0.664), and was therefore selected as the optimal predictive model. SHAP analysis showed that body mass index was the most important variable for asthma risk, followed by gender, race, marital status, and smoking history. METS-VF was associated with asthma risk and was retained by the Boruta algorithm as a predictor. Among the models compared, CatBoost offered the best discrimination and, combined with SHAP, interpretable predictions, which makes it a candidate tool for asthma risk screening.
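The evaluation steps summarized above (a 7:3 random split and an area under the curve reported with a 95% CI) can be illustrated with a minimal standard-library sketch. It is not the study's code: the function names are hypothetical, and the percentile bootstrap is an assumption, since the abstract does not state how the confidence interval was computed.

```python
import random

def split_70_30(rows, seed=0):
    """Shuffle rows and split them 7:3 into training and validation sets."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) identity; tied scores get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)              # labels are 0 (no asthma) or 1 (asthma)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs both classes")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, not from the paper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:          # skip resamples that contain one class only
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[max(0, int((alpha / 2) * n_boot))]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

In practice, `labels` would hold the validation-set asthma outcomes and `scores` the predicted probabilities from the fitted model; an AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation.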