Abstract
The escalating scale and complexity of Distributed Denial of Service (DDoS) attacks pose significant threats to the security and availability of cloud computing infrastructure. Traditional Intrusion Detection Systems (IDSs), which rely primarily on static or signature-based methods, struggle to adapt to the rapidly evolving nature of modern attack vectors. This paper proposes a deep reinforcement learning (DRL)-based framework for real-time DDoS detection in cloud environments. Specifically, the study investigates three actor-critic DRL algorithms, Twin Delayed Deep Deterministic Policy Gradient (TD3), Deep Deterministic Policy Gradient (DDPG), and Advantage Actor-Critic (A2C), to differentiate between benign and malicious network traffic. A hybrid feature selection strategy is introduced that combines Boruta (wrapper-based), SHAP (model-explainability-based), and cross-validation stability analysis to ensure that the selected features are statistically robust, interpretable, and consistent across datasets. The models are trained and evaluated on two benchmark datasets, CICDDoS2019 and UNSW-NB15, following a comprehensive preprocessing pipeline that includes feature selection, normalization, and class-imbalance handling through reward shaping and stratified experience replay. Experimental results show that TD3 achieves the best performance of the three algorithms, with an average accuracy of 99.12%, an AUC of 99.21%, and an inference latency of 1.87 milliseconds per sample, making it suitable for real-time deployment. An ablation study confirms the critical contribution of each preprocessing component, and SHAP-based analysis interprets the model's decisions by identifying the key traffic features influencing its predictions. The findings underscore the effectiveness, scalability, and interpretability of the proposed DRL-based approach, particularly TD3, in overcoming the limitations of traditional IDSs and providing an adaptive solution for DDoS detection in dynamic cloud environments.