Abstract
Seismic phase picking identifies and labels the arrival times of different seismic wave types (e.g., P-waves and S-waves) in waveform data, and is a fundamental step in seismological research and related applications. Although existing deep learning-based methods achieve notable accuracy, their architectures are often overly complex. In this paper, we propose a Multi-Scale Fusion U-Net (MFU-Net), which aims to achieve strong picking performance through straightforward enhancements to the traditional U-Net design. Specifically, we design a multi-scale feature fusion module within the skip connections of the U-Net to integrate multi-scale semantic and spatial information. We then incorporate a multi-head attention mechanism in the bottleneck layer to enhance the recognition of critical feature regions. Finally, we adopt a weighted class-balanced loss to improve the model's ability to identify minority classes. Tests on a seismic dataset provided by the Fujian Earthquake Agency and on the open-source STEAD dataset show that, compared with GPD, SegPhase, and SEANet, the proposed MFU-Net improves P-wave picking accuracy by 1.6% and 1.4% and S-wave picking accuracy by 4.1% and 2.7%, respectively.
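
The abstract does not state the exact form of the weighted class-balanced loss. As a rough illustration of the general idea only, the sketch below assumes inverse-frequency class weights applied to a cross-entropy loss over three hypothetical classes (noise, P, S). The function names, the weighting rule, and the label counts are all assumptions made for this example, not the formulation used in MFU-Net.

```python
import math


def inverse_frequency_weights(class_counts):
    """Assumed weighting rule: w_c = total / (num_classes * count_c).

    Rare classes receive larger weights; a perfectly balanced
    dataset yields a weight of 1.0 for every class.
    """
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [total / (num_classes * count) for count in class_counts]


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of per-sample negative log-likelihoods.

    probs:   list of per-sample class-probability lists
    labels:  list of integer class indices, one per sample
    weights: list of per-class weights
    """
    eps = 1e-12  # guards against log(0)
    weighted_sum = 0.0
    weight_total = 0.0
    for sample_probs, label in zip(probs, labels):
        w = weights[label]
        weighted_sum += -w * math.log(max(sample_probs[label], eps))
        weight_total += w
    return weighted_sum / weight_total


# Hypothetical per-sample label counts for (noise, P, S); not from the paper.
counts = [9000, 500, 500]
weights = inverse_frequency_weights(counts)

# Two samples: a well-predicted noise sample and a poorly predicted P sample.
probs = [[0.9, 0.05, 0.05], [0.6, 0.3, 0.1]]
labels = [0, 1]

plain_loss = weighted_cross_entropy(probs, labels, [1.0, 1.0, 1.0])
balanced_loss = weighted_cross_entropy(probs, labels, weights)
```

With these assumed counts, the P and S classes receive much larger weights than noise, so the poorly predicted P sample dominates `balanced_loss`, whereas in `plain_loss` both samples contribute equally.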