Abstract
Maintaining worker safety on hazardous construction sites remains a serious challenge, largely because of the widespread absence or misuse of Personal Protective Equipment (PPE), a major contributor to accidents in at-height operations. Conventional monitoring systems often lack the adaptability and accuracy that real-world conditions demand. To address this, a Discrete Dilated Cosine Causal Convolution Krawtchouk Orangutan Multi-Tchebichef Head Self-Attention Network (2D-3CKO-MTHSAN) is proposed, which combines Discrete Dilated Cosine Causal Convolution with Multi-Head Self-Attention (DCCMSA) and the Discrete Cosine-Krawtchouk-Tchebichef Transform (DCKTKT) to improve the accuracy and robustness of PPE detection. The approach uses real-time drone surveillance to continuously acquire visual data at active construction sites and in controlled laboratory settings, with emphasis on identifying the salient elements of the Personal Fall Arrest System (PFAS), such as helmets, harnesses, and lifelines. A two-stage preprocessing pipeline comprising entropy filtering and Kendall's τ correlation analysis improves image quality and feature prominence. Deep spatio-temporal features are extracted using Adaptive Causal Decision Transformers, and the parameters of the DCCMSA-DCKTKT model are tuned with the Orangutan Optimization Algorithm (OOA) to achieve stable performance under changing environmental conditions. Experimental results show that the system outperforms conventional deep learning models, achieving 99.9% detection accuracy. Unlike existing PPE detection methods that rely on fixed CCTV imagery or single-stage CNNs and offer limited scalability, the hybrid 2D-3CKO-MTHSAN approach integrates adaptive optimization with multi-scale attention to enable accurate, real-time drone-based detection. The novelty of the solution lies in integrating discrete transforms with OOA for efficient field deployment.