Abstract
INTRODUCTION: The exponential growth of heterogeneous, high-velocity CyberSecurity data generated by modern digital infrastructures presents both opportunities and challenges for threat detection, especially against increasingly sophisticated cyber-attacks. Traditional security tools struggle to process such data effectively, highlighting the need for scalable Big Data Analytics and advanced Machine Learning (ML) techniques. However, the black-box nature of many ML models limits interpretability, trust, and regulatory compliance in high-stakes environments.

METHODS: This study proposes an integrated framework that combines Big Data technologies, ML models, and Explainable Artificial Intelligence (XAI) to enable accurate, transparent, and real-time phishing attack detection. The framework leverages distributed computing and stream processing for efficient handling of large and diverse datasets while incorporating XAI methods to generate human-understandable model explanations.

RESULTS: Experimental evaluation conducted on four publicly available CyberSecurity datasets demonstrates improved phishing detection performance, enhanced interpretability of model decisions, and actionable insights into malicious URL behavior and patterns.

DISCUSSION: The proposed approach advances interpretable and scalable CyberSecurity analytics by addressing the gap between predictive accuracy and decision transparency. By integrating Big Data processing with XAI-driven ML, the framework offers a trustworthy solution for real-time threat detection, supporting informed decision-making and regulatory compliance.