Abstract
INTRODUCTION: This study investigates the relationship between blood lipid components and metabolic disorders, focusing on high-density lipoprotein cholesterol (HDL-C), which is crucial for cardiovascular health. It applies logistic regression (LR), decision tree (DT), random forest (RF), K-nearest neighbors (KNN), XGBoost (XGB), and neural network (NN) algorithms to explore how hematological factors relate to HDL-C levels. METHOD: The study included 9704 participants, categorized as having normal or low HDL-C levels. The data were analyzed with a data mining approach in which LR, DT, RF, KNN, XGB, and NN models were trained to classify HDL-C status. In addition, DT was used to identify a predictive model for HDL-C. RESULT: The ML models identified gender-specific hematological predictors of HDL-C levels. Among the models compared, logistic regression exhibited the highest performance. NHR and LHR were the most influential predictors in males and females, respectively, and SHAP analysis confirmed their critical roles, alongside LYM, NEUT, and WBC, in HDL-C classification. DISCUSSION: The results indicate that hematological inflammation plays a role in HDL-C homeostasis. The mechanisms underlying these relationships are not fully understood, but a complex interplay among inflammation, HDL-C levels, and cardiometabolic health is evident. These findings support the pathophysiological role of inflammatory pathways in cardiometabolic disorders and provide insight into how modulating hematological inflammation may contribute to disease prevention or treatment.