Abstract
Background/Objectives: Repetitive behaviors such as hand flapping, body rocking, and head shaking are characteristic of Autism Spectrum Disorder (ASD) and serve as early signs of neurodevelopmental variation. Traditional diagnostic procedures require extensive manual observation, which is time-consuming, produces subjective results, and remains inaccessible in many regions. This research introduces a real-time system for detecting ASD-typical behaviors by analyzing body movements with the You Only Look Once (YOLOv11) deep learning model. Methods: The system's multi-layered design integrates monitoring, network, cloud, and ASD-typical behavior detection layers to support real-time video acquisition, wireless data transfer, cloud-based analysis, and behavior classification. We collected and annotated our own dataset of 72 videos, yielding 13,640 images across four behavior classes: hand flapping, body rocking, head shaking, and non_autistic. Results: YOLOv11 outperformed baseline models, including the Convolutional Neural Network (CNN)-based MobileNet-SSD and Long Short-Term Memory (LSTM), achieving 99% accuracy, 96% precision, and 97% in both recall and F1-score. Conclusions: These results indicate that our system offers a scalable solution for real-time ASD screening, which could support clinicians, educators, and caregivers in early intervention and ongoing behavioral monitoring.