Abstract
Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder marked by deficits in social communication and by repetitive, stereotypical behaviors. The most widespread approach to diagnosing ASD in children combines psychological screening tests with observation of behavioral patterns, especially repetitive behaviors. These behaviors include hand-flapping, head banging, and spinning, which are common among children with ASD. In this research, we examine abnormal behavioral patterns that may indicate ASD in videos of children engaged in everyday activities in unstructured settings. The publicly available multiclass Self-Stimulatory Behavior Dataset (SSBD) is used to classify autistic behavior. Before training, the dataset is thoroughly pre-processed using region-of-interest (ROI) detection and image cropping to remove irrelevant background objects. In addition, data augmentation methods are used to reduce overfitting and to improve training efficiency and generalization. To capture spatiotemporal details, several deep learning models are evaluated, including the proposed CNN-GRU model, 3D-CNN + LSTM, MobileNet, VGG16, and EfficientNet-B7. The experimental findings show that the proposed CNN-GRU model outperforms all the other models tested. Under k-fold cross-validation, the model achieves a stable accuracy between 0.9284 ± 0.0039 and 0.9294 ± 0.0038, which indicates that it is robust and consistent across folds. The effectiveness of the proposed approach is further supported by comparisons with state-of-the-art methods. The results show that systems based on action recognition can help clinicians monitor behavioral trends and can facilitate quick, accurate, and effective screening for ASD.
The proposed approach predicts behavior effectively in real-life, uncontrolled videos and shows strong potential for clinical use as a decision-support tool.
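
As an illustration of how a cross-validated accuracy of the form "mean ± deviation" can be obtained, the following minimal Python sketch splits sample indices into k folds and summarizes per-fold accuracies. It is not the implementation used in this work: the function names `k_fold_indices` and `summarize_folds` are hypothetical, the use of the sample standard deviation is an assumption (the abstract does not state which deviation is reported), and the accuracy values in the usage note are placeholders rather than experimental results.

```python
from statistics import mean, stdev


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Hypothetical helper: the first (n_samples % k) folds get one extra sample.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def summarize_folds(fold_accuracies):
    """Return (mean, sample standard deviation) of per-fold accuracies.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    return mean(fold_accuracies), stdev(fold_accuracies)


if __name__ == "__main__":
    # Each fold serves once as the validation set; the rest form the training set.
    folds = k_fold_indices(n_samples=10, k=3)
    for i, val_idx in enumerate(folds):
        train_idx = [j for f in folds if f is not val_idx for j in f]
        print(f"fold {i}: {len(train_idx)} train, {len(val_idx)} validation")

    # Placeholder accuracies, not results from this study.
    avg, dev = summarize_folds([0.90, 0.92, 0.94])
    print(f"accuracy = {avg:.4f} ± {dev:.4f}")
```

In practice, each fold's accuracy would come from training the model on the training indices and scoring it on the held-out validation indices; a small deviation across folds is what supports a claim of consistency.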