Abstract
Air pollution forecasting plays a vital role in mitigating the adverse impacts of deteriorating air quality on public health and urban sustainability. This study presents a data-driven hybrid framework for urban air quality forecasting, focusing on the accurate prediction of [Formula: see text] concentrations. To ensure reliable model training, the Beijing Air Quality dataset was preprocessed by handling missing values, removing outliers, and selecting features based on correlation analysis. The proposed residual-enhanced framework integrates the statistical interpretability of Prophet with the nonlinear learning capacity of Long Short-Term Memory (LSTM) networks. Prophet is first employed to capture the long-term trend and seasonality components of the [Formula: see text] time series, and an LSTM is then trained on the Prophet residuals to model the complex, nonlinear dependencies that Prophet cannot explain. The final forecast combines the Prophet prediction with the LSTM residual prediction. Experiments on this dataset show that the proposed Prophet+LSTM model significantly outperforms both traditional statistical methods and advanced deep learning baselines, achieving lower MAE and RMSE than AC-LSTM, LSTM-FC, CNN-LSTM, and XGBoost. These results highlight the effectiveness of preprocessing and hybrid modeling for accurate air quality forecasting and provide a framework for urban air quality monitoring.
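The two-stage residual idea described above can be illustrated with a minimal, NumPy-only sketch. It is not the authors' implementation: a least-squares fit of a linear trend plus Fourier seasonal terms stands in for Prophet, a linear autoregression on lagged residuals stands in for the LSTM, and the series is synthetic rather than the Beijing Air Quality data. All function names and parameter values (`fourier_design`, `lagged_matrix`, `hybrid_forecast`, the 24-step period, the lag count) are assumptions made for illustration only.

```python
import numpy as np


def fourier_design(t, period, order):
    """Design matrix: intercept, linear trend, and Fourier seasonal terms."""
    cols = [np.ones_like(t, dtype=float), t.astype(float)]
    for k in range(1, order + 1):
        angle = 2.0 * np.pi * k * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)


def lagged_matrix(series, n_lags):
    """Rows of n_lags consecutive values, paired with the value that follows."""
    n = len(series)
    X = np.column_stack([series[i:n - n_lags + i] for i in range(n_lags)])
    return X, series[n_lags:]


def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def hybrid_forecast(y, n_train, period=24, order=3, n_lags=6):
    """One-step-ahead forecasts for y[n_train:]; returns (stage-1 only, hybrid)."""
    t = np.arange(len(y))

    # Stage 1 (stand-in for Prophet): trend + seasonality, fitted on training data only.
    A = fourier_design(t, period, order)
    coef, *_ = np.linalg.lstsq(A[:n_train], y[:n_train], rcond=None)
    base = A @ coef
    resid = y - base  # what stage 1 cannot explain

    # Stage 2 (stand-in for the LSTM): predict the next residual from recent residuals.
    X, target = lagged_matrix(resid, n_lags)
    X = np.column_stack([np.ones(len(X)), X])
    split = n_train - n_lags  # rows whose target lies inside the training period
    w, *_ = np.linalg.lstsq(X[:split], target[:split], rcond=None)
    resid_hat = X[split:] @ w

    # Final forecast: stage-1 prediction plus predicted residual.
    return base[n_train:], base[n_train:] + resid_hat


# Synthetic hourly series: trend + daily cycle + autocorrelated noise.
rng = np.random.default_rng(0)
n = 24 * 60
t = np.arange(n)
noise = np.zeros(n)
for i in range(1, n):
    noise[i] = 0.8 * noise[i - 1] + rng.normal(scale=5.0)
y = 50.0 + 0.01 * t + 20.0 * np.sin(2.0 * np.pi * t / 24) + noise

n_train = 24 * 50
base_pred, hybrid_pred = hybrid_forecast(y, n_train)
y_test = y[n_train:]
print("stage 1 only: MAE", mae(y_test, base_pred), "RMSE", rmse(y_test, base_pred))
print("hybrid:       MAE", mae(y_test, hybrid_pred), "RMSE", rmse(y_test, hybrid_pred))
```

In this sketch the second stage sees only the residuals of the first, so it is limited to structure the trend and seasonality model left behind. Replacing the stand-ins with Prophet and an LSTM would keep the same decomposition while changing the capacity of each stage.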