Abstract
This study forecasts dengue incidence in Bangladesh by applying and comparing statistical and machine learning techniques. Dengue surveillance data from January 1, 2022, to December 1, 2023, for five divisions of Bangladesh were obtained from the Directorate General of Health Services. Initial time series analysis showed distinct trends and seasonality. Seasonal Autoregressive Integrated Moving Average (SARIMA), Multi-Layer Perceptron (MLP) neural networks, XGBoost, and Support Vector Regression (SVR) were implemented to model and forecast monthly dengue cases for each division. The models were evaluated using three error metrics: root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The results indicate that XGBoost provided the most accurate forecasts, with the lowest errors overall. For the Dhaka division, which had the most data, XGBoost captured seasonality and trends well, with an RMSE of 109, an MAE of 127, and a MAPE of 12.9%. SARIMA models performed reasonably well but had higher errors than the machine learning models. SVR was found unsuitable for forecasting these data. In summary, advanced machine learning models such as XGBoost proved effective for infectious disease forecasting using limited surveillance data in resource-constrained settings. The forecasts can assist public health planning in Bangladesh.
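To make the three evaluation metrics concrete, the following is a minimal sketch of how RMSE, MAE, and MAPE can be computed for a set of monthly forecasts. The function name `forecast_errors`, the choice to skip months with zero observed cases when computing MAPE, and the sample counts are all assumptions made for illustration; they are not taken from the study's code or data.

```python
import math


def forecast_errors(actual, predicted):
    """Return (rmse, mae, mape_percent) for two equal-length sequences.

    Hypothetical helper for illustration; not the study's implementation.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")

    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)

    # Root mean squared error: penalizes large misses more heavily.
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Mean absolute error: average size of the miss, in cases.
    mae = sum(abs(e) for e in errors) / n

    # Mean absolute percentage error. It is undefined when an observed
    # value is zero, so those months are skipped here (an assumption).
    ratios = [abs(e) / abs(a) for e, a in zip(errors, actual) if a != 0]
    mape = 100.0 * sum(ratios) / len(ratios) if ratios else float("nan")

    return rmse, mae, mape


# Made-up monthly case counts, for illustration only.
actual = [100, 200, 400]
predicted = [110, 180, 400]

rmse, mae, mape = forecast_errors(actual, predicted)
print(f"RMSE={rmse:.2f}  MAE={mae:.2f}  MAPE={mape:.2f}%")
```

RMSE and MAE are expressed in cases, while MAPE is scale-free, which makes it the more convenient metric for comparing divisions with very different case counts. When computed on the same set of forecast errors, RMSE is never smaller than MAE.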