Abstract
This paper presents an innovative approach to detecting cracks in asphalt pavement using an FCN-attention model, which integrates attention mechanisms into a fully convolutional network (FCN) for enhanced pixel-level segmentation. The model employs a ResNet-50-based encoder and incorporates channel-wise and spatial attention modules to refine feature extraction and focus on the most relevant image regions. The results demonstrate that the FCN-attention model outperforms traditional models such as VGG-16, AlexNet, MobileNet, and GoogleNet across multiple evaluation metrics. Specifically, the FCN-attention model achieves a global accuracy rate of 90.79%, with a precision of 92.3%, recall of 89.5%, and an F1-score of 90.9%. Additionally, the model achieves an average intersection-over-union (IoU) ratio of 69.7% and a test duration of 109.1 ms per image. The proposed method also excels in crack length and width calculation, providing real-world dimensions for the detected cracks. The model's effectiveness is further validated through an ablation study, which highlights the significant impact of the attention mechanism on model performance.