Abstract
To address the accuracy-efficiency trade-off that deep learning models face in structural crack detection, this paper proposes an optimized version of the YOLOv8 model. YOLO (You Only Look Once) is a family of real-time object detection algorithms known for high inference speed and competitive accuracy. To improve crack feature representation, the backbone is enhanced with the SimAM attention mechanism. A lightweight C3Ghost module reduces the parameter count and computational cost, and a bidirectional multi-scale feature fusion structure replaces the standard neck to improve efficiency. Experimental results show that the proposed model achieves a mean Average Precision (mAP) of 88.7% at an IoU threshold of 0.5 (mAP@0.5) and 69.4% for mAP@0.5:0.95, while requiring 12.3% fewer giga floating-point operations (GFLOPs) and running inference faster. These improvements significantly enhance the detection of fine cracks while maintaining real-time performance, making the model suitable for engineering scenarios.
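For readers unfamiliar with SimAM, the following is a minimal sketch of the parameter-free weighting it is generally described as computing, applied to a single channel of a feature map. It is an illustration under stated assumptions, not the implementation used in this paper: the function names `simam_weights` and `apply_simam` are hypothetical, and the regularization value `e_lambda=1e-4` is an assumed default. The idea is that activations deviating strongly from the channel mean receive larger weights, which is one plausible reason the mechanism helps thin, high-contrast structures such as cracks stand out.

```python
import math


def simam_weights(channel, e_lambda=1e-4):
    """Return SimAM-style attention weights for one channel (2D list of floats).

    e_lambda is an assumed regularization constant, not a value from the paper.
    """
    values = [v for row in channel for v in row]
    n = len(values) - 1  # number of "other" activations in the channel
    if n < 1:
        raise ValueError("channel must contain at least two activations")

    mean = sum(values) / len(values)
    # Squared deviation of each activation from the channel mean.
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    variance = sum(d for row in sq_dev for d in row) / n

    weights = []
    for row in sq_dev:
        weight_row = []
        for d in row:
            # Inverse of the minimal energy: large when the activation
            # differs strongly from the rest of the channel.
            inv_energy = d / (4.0 * (variance + e_lambda)) + 0.5
            # Sigmoid squashes the score into (0, 1).
            weight_row.append(1.0 / (1.0 + math.exp(-inv_energy)))
        weights.append(weight_row)
    return weights


def apply_simam(channel, e_lambda=1e-4):
    """Scale each activation by its SimAM-style weight."""
    weights = simam_weights(channel, e_lambda)
    return [
        [v * w for v, w in zip(row, w_row)]
        for row, w_row in zip(channel, weights)
    ]


if __name__ == "__main__":
    # A single strong activation (e.g. a crack pixel) on a flat background.
    toy_channel = [
        [0.0, 0.0, 0.0],
        [0.0, 9.0, 0.0],
        [0.0, 0.0, 0.0],
    ]
    for row in simam_weights(toy_channel):
        print([round(w, 3) for w in row])
```

In this toy channel the central activation receives a higher weight than the background, and the weighting introduces no learnable parameters, which is consistent with the lightweight design goal stated above.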