Abstract
China is the world's largest photovoltaic (PV) market, and its installed PV capacity continues to grow rapidly, making PV a central pillar of the energy transition and of the country's carbon peaking and carbon neutrality goals. Existing PV datasets, however, are fragmented and mutually inconsistent. To address this, this study develops the 2024 China Photovoltaic Power Plant Vector Dataset (CPVPD-2024) using a deep semantic segmentation framework (DSFA-SwinNet) combined with geospatial verification. The dataset covers all 34 provincial-level administrative regions of China and achieves an overall Precision of 90.38% and an Intersection over Union (IoU) of 81.78% in test zones, with significant improvements in identifying gaps between PV arrays and in detecting small-scale distributed power plants. The results indicate that the total installed PV area in China reached 4,520.47 km² by 2024, with a spatial pattern dominated by agrivoltaic systems and concentrated in arid regions. As the first national panel-level PV vector dataset, CPVPD-2024 supports precise PV site selection, ecological assessment, and AI-driven remote sensing analysis.
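For readers unfamiliar with the two reported metrics, the following minimal sketch shows how Precision and IoU are conventionally computed for a binary segmentation mask. It is illustrative only and is not the evaluation code used in this study: the function name `precision_and_iou` and the toy masks are assumptions, and 1 is assumed to mark a PV pixel.

```python
def precision_and_iou(pred, truth):
    """Return (precision, IoU) for two equally sized binary masks.

    pred and truth are nested lists of 0/1 values, where 1 is assumed
    to mark a PV pixel (hypothetical convention for this sketch).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # predicted PV, actually PV
            elif p and not t:
                fp += 1      # predicted PV, actually background
            elif t and not p:
                fn += 1      # missed PV pixel
    # Precision = TP / (TP + FP); IoU = TP / (TP + FP + FN)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return precision, iou


# Toy 2x3 masks, purely for illustration.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
precision, iou = precision_and_iou(pred, truth)
print(f"Precision = {precision:.4f}, IoU = {iou:.4f}")
```

Because IoU also penalizes missed PV pixels (false negatives) while Precision does not, IoU can never exceed Precision for the same prediction, which is consistent with the reported 81.78% IoU against 90.38% Precision.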