Abstract
OBJECTIVES: This study aimed to develop an optimized ensemble learning model that improves the prediction of hypertension complicated by coronary heart disease (CHD) through feature selection and classifier fusion, enhancing both the accuracy and the stability of risk assessment.
METHODS: We constructed an ensemble predictive model based on voting fusion to support early detection of hypertension complicated by CHD. The dataset comprised 2,487 patients with essential hypertension (EH) complicated by CHD and 3,904 non-CHD controls. After data cleaning and univariate and multivariate feature selection, an 18-dimensional feature set was derived. Five machine learning algorithms (logistic regression, random forest, XGBoost, CatBoost, and classification and regression trees [CART]) were trained independently and then combined in a voting ensemble to optimize predictive performance.
RESULTS: The voting fusion model outperformed all five individual classifiers, achieving an area under the curve (AUC) of 0.906 and an accuracy of 0.888 in predicting EH complicated by CHD.
CONCLUSIONS: The proposed ensemble model improves classification accuracy and robustness and shows promise for early risk stratification of hypertension-associated CHD. Although the model demonstrates strong predictive performance on cross-sectional data, its reliance on single-timepoint measurements and a selected control population means that further validation is required. Pending additional studies, this framework may serve as a supplementary decision-support tool within clinical informatics systems.
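
The voting fusion step described in the Methods can be illustrated with a minimal sketch. The abstract does not state whether hard (majority) or soft (probability-averaging) voting was used, nor whether the base classifiers were weighted, so both unweighted variants are shown here as assumptions. The function names, the 0.5 threshold, and the probabilities are hypothetical illustration values, not study data.

```python
from statistics import mean


def soft_vote(probabilities, threshold=0.5):
    """Average each patient's positive-class probability across models,
    then threshold the mean. Returns (fused_probabilities, labels)."""
    if not probabilities:
        raise ValueError("need at least one model")
    n_patients = len(probabilities[0])
    if any(len(p) != n_patients for p in probabilities):
        raise ValueError("all models must score the same patients")
    fused = [mean(model[i] for model in probabilities) for i in range(n_patients)]
    labels = [1 if p >= threshold else 0 for p in fused]
    return fused, labels


def hard_vote(probabilities, threshold=0.5):
    """Threshold each model's probability first, then take a strict majority."""
    n_models = len(probabilities)
    votes = [[1 if p >= threshold else 0 for p in model] for model in probabilities]
    return [1 if 2 * sum(column) > n_models else 0 for column in zip(*votes)]


# Hypothetical predicted probabilities of CHD for three patients,
# one list per base classifier (made-up numbers for illustration only).
model_probs = {
    "logistic_regression": [0.62, 0.20, 0.55],
    "random_forest":       [0.71, 0.15, 0.52],
    "xgboost":             [0.80, 0.10, 0.55],
    "catboost":            [0.77, 0.12, 0.10],
    "cart":                [1.00, 0.00, 0.00],
}

fused, soft_labels = soft_vote(list(model_probs.values()))
hard_labels = hard_vote(list(model_probs.values()))
print("soft:", soft_labels, "hard:", hard_labels)
```

In this toy example the two schemes disagree on the third patient: three of the five classifiers narrowly exceed the threshold, so the majority vote is positive, while the averaged probability stays below 0.5. Differences of this kind are one reason the choice of voting rule matters when reporting ensemble performance.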