Abstract
Rapid and accurate identification and tracking of lightning clusters in massive lightning detection datasets are crucial for real-time thunderstorm nowcasting and for climatological analyses of thunderstorm activity. Although density-based clustering algorithms can identify clusters of arbitrary shape at fine scales, their performance is often hindered by large data volumes and by large variations in lightning density. To address these challenges, we propose a multi-scale spatiotemporal lightning clustering framework, termed CC3D-CSCAP, which consists of two components. First, the 3-D connected component algorithm (CC3D) performs coarse-scale segmentation, dividing the lightning dataset into spatiotemporally disconnected subsets using 26-connectivity. Second, the cylinder-based scan clustering algorithm with adaptive parameters (CSCAP) is applied to each subset for fine-scale identification of lightning clusters. Because a subset may still contain multiple thunderstorms with different lightning densities, CSCAP determines its clustering parameters adaptively from the statistical characteristics (time difference and spatial distance) of that subset. Compared with fixed-parameter methods, CC3D-CSCAP identifies more clusters (771,033) while retaining a high percentage of usable lightning strokes (98.988%). The clustering results align well with the theoretical criteria for optimal clustering, and the framework is promising for global applications in lightning data analysis, nowcasting, and climatological studies of convective systems.
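The coarse-scale step can be illustrated with a minimal sketch, assuming the strokes are first binned onto a regular (time, longitude, latitude) voxel grid. The function name `coarse_segment` and the bin sizes `dt`, `dx`, and `dy` are hypothetical and are not taken from this work. The sketch shows only 26-connected labelling of occupied voxels; it is not the authors' implementation and does not cover the CSCAP step.

```python
from collections import deque
from itertools import product

# The 26 neighbour offsets: every (dt, dx, dy) in {-1, 0, 1}^3 except the origin.
NEIGHBOURS_26 = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]


def coarse_segment(strokes, dt=600.0, dx=0.1, dy=0.1):
    """Label strokes by 26-connected components of occupied voxels.

    strokes: list of (time_s, lon_deg, lat_deg) tuples.
    dt, dx, dy: illustrative bin sizes (assumptions, not values from the paper).
    Returns one integer subset label per stroke; label values are arbitrary.
    """
    # Map each stroke to the integer index of the voxel that contains it.
    voxels = [(int(t // dt), int(x // dx), int(y // dy)) for t, x, y in strokes]
    occupied = set(voxels)

    label_of = {}
    next_label = 0
    for seed in occupied:
        if seed in label_of:
            continue
        # Breadth-first flood fill over occupied voxels that touch by
        # a face, an edge, or a corner (26-connectivity).
        label_of[seed] = next_label
        queue = deque([seed])
        while queue:
            a, b, c = queue.popleft()
            for da, db, dc in NEIGHBOURS_26:
                nb = (a + da, b + db, c + dc)
                if nb in occupied and nb not in label_of:
                    label_of[nb] = next_label
                    queue.append(nb)
        next_label += 1

    return [label_of[v] for v in voxels]


# Two strokes in corner-adjacent voxels plus one far away in space and time.
strokes = [(0.0, 0.05, 0.05), (700.0, 0.15, 0.15), (5000.0, 5.05, 5.05)]
labels = coarse_segment(strokes)
```

In this sketch the first two strokes fall into voxels that touch only at a corner, so they share a subset under 26-connectivity but would be separated under 6-connectivity. A fine-scale method would then run on each labelled subset independently, which is what keeps the per-subset data volume manageable.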