Abstract
The incidence of allergic rhinitis (AR) is rising year over year, severely impairing patients' quality of life and adding to the socioeconomic burden. The limitations of current diagnostic methods make the development of efficient, low-cost early screening tools urgent. Using routine blood test data, this study combined a comprehensive filter strategy, an embedded strategy, and a wrapper strategy for feature selection through ensemble hard voting, retaining as model inputs the 16 highly correlated features that were selected at least twice. Of five candidate machine learning algorithms (K-nearest neighbor, logistic regression, random forest, decision tree, and support vector machine), the three with the highest area under the curve (AUC) were then chosen as base classifiers, and an intelligent early screening model for AR was constructed from them using an ensemble soft voting strategy. The model achieved an AUC of 0.862, significantly outperforming each individual algorithm, and reached an accuracy of 73.91% on external validation. These results indicate that combining ensemble voting strategies with machine learning can yield an effective early screening model for AR based on routine blood test parameters without placing any additional burden on patients, offering a new approach to improving diagnosis and treatment in primary care settings.
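The two voting steps named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the feature names, the selector outputs, the probabilities, and the 0.5 decision threshold are all hypothetical, and only the Python standard library is assumed. Hard voting keeps a feature when at least two selection methods choose it; soft voting averages the positive-class probabilities of the base classifiers.

```python
from collections import Counter


def hard_vote_features(selections, min_votes=2):
    """Keep features chosen by at least `min_votes` selection methods."""
    # Each method casts at most one vote per feature.
    votes = Counter(f for chosen in selections.values() for f in set(chosen))
    return sorted(f for f, n in votes.items() if n >= min_votes)


def soft_vote(probabilities, threshold=0.5):
    """Average per-classifier positive-class probabilities, then threshold."""
    n_models = len(probabilities)
    n_samples = len(probabilities[0])
    mean = [sum(p[i] for p in probabilities) / n_models for i in range(n_samples)]
    labels = [int(m >= threshold) for m in mean]
    return mean, labels


# Hypothetical outputs of three feature selection methods.
selections = {
    "filter": ["EOS%", "EOS#", "WBC"],
    "embedded": ["EOS%", "BASO%", "WBC"],
    "wrapper": ["EOS%", "PLT"],
}
selected = hard_vote_features(selections)

# Hypothetical AR probabilities from three base classifiers for three patients.
probs = [
    [0.90, 0.20, 0.60],
    [0.70, 0.40, 0.30],
    [0.80, 0.30, 0.45],
]
mean_probs, labels = soft_vote(probs)
print(selected, labels)
```

In this toy input, only the features picked by two or more methods survive, and the third patient is classified as negative even though one classifier voted positive, because the averaged probability falls below the threshold.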