Abstract
Accurately forecasting traffic congestion on urban expressways remains challenging, especially under unstable flow conditions where conventional machine learning models often suffer from reduced accuracy and interpretability. This study introduces a domain-theoretic machine learning framework designed for real-time congestion prediction on the Chalong Rat Expressway in Bangkok, Thailand. Feature engineering incorporates principles from the macroscopic cell transmission model, Kerner's three-phase theory, and Helbing's microscopic dynamics to capture key interactions such as density-flow relationships, jam propagation, and driver response gradients. A hybrid random forest-XGBoost ensemble is developed and evaluated against standard machine learning baselines. The results demonstrate that the proposed ensemble outperformed the baselines in mean absolute error (MAE), root mean square error (RMSE), coefficient of determination (R²), and prediction interval coverage probability (PICP), particularly near congestion transition boundaries. SHapley Additive exPlanations (SHAP) analysis confirmed corrected outflow, jam speed, and repulsive force as the dominant predictors, underscoring the model's interpretability. By integrating traffic theory with interpretable machine learning, this framework enables accurate, explainable, and deployable real-time congestion forecasting for intelligent transportation systems.