Abstract
Tourism has become a central pillar of Saudi Arabia's Vision 2030, supporting national efforts to diversify the economy and reduce reliance on oil revenues. As the Kingdom expands its religious, recreational, heritage, and sports tourism sectors, accurate forecasting of tourist demand is essential for sustainable planning, infrastructure development, and policy formulation. This study investigates the use of machine learning (ML) techniques to predict tourist arrivals in Saudi Arabia using a city-level dataset covering 2021–2023. A range of regression and ensemble models, including Random Forest, Gradient Boosting, HistGradientBoosting, and combinatorial stacking and voting ensembles, were evaluated across multiple experimental scenarios. The VotingR2 ensemble consistently achieved the strongest performance, with R² values of 0.9601 in mixed-year training (Scenario A), 0.8735 in temporal holdout evaluation (Scenario B), and 0.9578 under 10-fold time-series cross-validation (Scenario C). Long-horizon projections for 2024–2034 were generated by extrapolating the input features with Prophet and passing them to the ML-based forecasting models. The results demonstrate the effectiveness of ensemble methods in capturing nonlinear tourism patterns and provide actionable insights to support strategic decision-making, infrastructure optimization, and tourism policy development within the framework of Vision 2030.