Abstract
Accurate estimation of soil organic carbon (SOC) is crucial for climate mitigation and sustainable land management. Near-infrared (NIR) spectroscopy provides a rapid, cost-effective approach to SOC assessment, but its predictive performance depends on calibration datasets with adequate spatiotemporal coverage. Here, we present the Gyeonggi Soil Spectral Library (G-SSL), comprising NIR spectra (1,400–2,500 nm) from 1,500 topsoil samples (0–15 cm) collected systematically across Gyeonggi Province, South Korea, in 2024. Sampling spans 11 representative land cover types: deciduous, coniferous, and mixed forests; paddy and upland fields; orchards; greenhouses; urban parks; artificial grasslands; riparian zones; and bare lands. To develop an NIR-based SOC prediction model, we calibrated partial least squares regression (PLSR) models using SOC measurements from 712 samples; the models showed robust performance in a 70:30 train-test split (R² = 0.95, RMSE = 0.39%, RPD = 4.54). The G-SSL provides a spatially robust, high-resolution resource for digital SOC mapping and establishes a methodological benchmark for developing region-specific spectral libraries in other heterogeneous landscapes.
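For readers less familiar with the three performance statistics reported above, the following minimal sketch shows one common way to compute R², RMSE, and RPD from paired observed and predicted values. It is illustrative only: the function name `regression_metrics` and the sample values are hypothetical, and the definition of RPD used here (sample standard deviation of the observed values divided by RMSE) is a widely used convention that we assume, not necessarily the exact formula applied to the G-SSL models.

```python
import math
from statistics import stdev


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, RPD) for paired observed and predicted values.

    Assumed definitions (common in soil spectroscopy, not taken from the source):
      R^2  = 1 - SS_res / SS_tot
      RMSE = sqrt(SS_res / n)
      RPD  = sample SD of observed values / RMSE
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two sequences of equal length (n >= 2)")

    n = len(y_true)
    mean_obs = sum(y_true) / n
    # Residual and total sums of squares
    ss_res = sum((o - p) ** 2 for o, p in zip(y_true, y_pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in y_true)
    if ss_tot == 0 or ss_res == 0:
        raise ValueError("metrics undefined for constant observations or a perfect fit")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    rpd = stdev(y_true) / rmse  # stdev uses the n - 1 denominator
    return r2, rmse, rpd


# Hypothetical SOC values (%) for a held-out test set; not G-SSL data.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.9, 5.1]

r2, rmse, rpd = regression_metrics(observed, predicted)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f}%, RPD = {rpd:.2f}")
```

Under these definitions, RPD approaches 1/sqrt(1 - R²) as the number of samples grows, so an R² of 0.95 corresponds to an RPD of roughly 4.5, which is in line with the values reported above.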