Abstract
BACKGROUND: Rift Valley fever (RVF) is a mosquito-borne zoonotic disease for which predictive modeling is often hindered by sparse data, particularly the high frequency of zero counts in both human and livestock surveillance systems. Zero-inflated models are commonly used for such data, but several temporal count modeling frameworks exist, including the less common self-exciting models, which assume that an initial case increases the likelihood of subsequent cases. METHODS: This study compares three zero-inflated Bayesian models: the negative binomial with autoregressive temporal random effects (ZINB), the self-exciting negative binomial (SE-NB), and the generalized autoregressive moving average negative binomial (GARMA-NB). The models were evaluated on simulated datasets with varying levels of sparsity. RESULTS: Zero-inflation substantially improved predictive performance within model-specific sparsity ranges: 29-94.5% (ZINB), 25-93% (SE-NB), and 30-95% (GARMA-NB). Applied to monthly RVF incidence data from northern Kenya (2018-2024), the ZINB model with a three-month rainfall lag provided the most accurate forecasts. CONCLUSION: These findings underscore the importance of zero-inflated negative binomial models and climate-based covariates in enhancing early warning systems for RVF-endemic regions.
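As an illustration of the zero-inflation idea referred to above, the following minimal sketch evaluates the probability mass function of a generic zero-inflated negative binomial distribution and the share of zeros it implies. It is not the study's implementation: the mean/dispersion parameterization and the names `mu`, `r`, `pi`, `nb_logpmf`, `zinb_pmf`, and `expected_zero_fraction` are assumptions chosen for illustration, and the temporal components (autoregressive, self-exciting, GARMA) and the Bayesian fitting are omitted.

```python
import math


def nb_logpmf(y, mu, r):
    """Log-pmf of a negative binomial with mean mu > 0 and dispersion r > 0.

    Assumed parameterization (illustrative): variance = mu + mu**2 / r.
    """
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def zinb_pmf(y, mu, r, pi):
    """Pmf of a zero-inflated negative binomial.

    pi is the probability of a structural (excess) zero; with
    probability 1 - pi the count comes from the negative binomial.
    """
    nb = math.exp(nb_logpmf(y, mu, r))
    if y == 0:
        # Zeros arise either structurally or from the count process.
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


def expected_zero_fraction(mu, r, pi):
    """Share of zeros implied by the model, i.e. P(Y = 0)."""
    return zinb_pmf(0, mu, r, pi)


if __name__ == "__main__":
    # Hypothetical parameter values, not estimates from the study.
    for pi in (0.0, 0.3, 0.6, 0.9):
        share = expected_zero_fraction(mu=2.0, r=1.0, pi=pi)
        print(f"pi={pi:.1f} -> implied share of zeros = {share:.3f}")
```

Under these assumptions, raising `pi` increases the implied share of zeros beyond what the negative binomial component alone produces, which is one simple way to generate simulated series at different sparsity levels.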