Abstract
In today's digital era, cyberattacks are rapidly evolving, rendering traditional security mechanisms increasingly inadequate. AI-based Network Intrusion Detection Systems (NIDS) have emerged as a promising solution, owing to their ability to detect and respond to malicious activity using machine learning techniques. However, these systems remain vulnerable to adversarial threats, particularly data poisoning attacks, in which attackers manipulate training data to degrade model performance. In this work, we model black-box poisoning attacks against two tree-based classifiers, Random Forest and Gradient Boosting. We introduce FortiNIDS, a robust framework that employs a surrogate neural network to craft adversarial perturbations, exploiting the transferability of adversarial examples across models. In addition, we investigate defense strategies designed to improve the resilience of NIDS in smart city Internet of Things (IoT) settings. Specifically, we evaluate adversarial training and the Reject on Negative Impact (RONI) technique on the widely adopted CICDDoS2019 dataset. Our findings highlight the effectiveness of targeted defenses in improving detection accuracy and maintaining system reliability under adversarial conditions, thereby contributing to the security and privacy of smart city networks.