Abstract
Digital Soil Mapping (DSM) techniques have advanced significantly in recent decades, helping to close critical gaps in soil data and knowledge. This study was conducted in the arid Gavkhouni sub-basin of Isfahan Province, central Iran, where environmental stresses such as salinity and water scarcity challenge sustainable land management. We employed 34 environmental covariates derived from Landsat 8 imagery and a digital elevation model, combined with 96 surface soil samples (0 to 20 cm depth), to assess the performance of six machine-learning models: Random Forest (RF), Classification and Regression Tree (CART), Support Vector Regression (SVR), Generalized Additive Model (GAM), Generalized Linear Model (GLM), and an ensemble approach. Unlike many previous studies that have focused on a single soil attribute with a limited set of predictors, our work adopts an integrated approach to map four salinity-related soil properties: Ca, CaCO₃, CaSO₄, and SO₄. Predictor selection involved multicollinearity testing using the Variance Inflation Factor (VIF) and the Boruta algorithm. Model performance was assessed using tenfold cross-validation. The ensemble model performed best, achieving R² values of 0.89 for Ca, 0.84 for CaCO₃, 0.79 for SO₄, and 0.73 for CaSO₄. Elevation and the Temperature-Vegetation Dryness Index (TVDI) were the most influential predictors for Ca, while the Tasseled Cap Brightness (TCB) and Tasseled Cap Wetness (TCW) indices were most important for CaCO₃. For CaSO₄, Band 5 (B5) and TCB were the most effective, whereas SO₄ predictions were driven by TCB along with Bands 5 and 7. These findings highlight the potential of remote sensing-based DSM to enhance soil monitoring in data-scarce, arid environments. The growing availability of free satellite data, such as Landsat, offers valuable opportunities to improve soil assessment and promote sustainable land management in resource-limited regions like Iran.
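
To make the validation step concrete, the following is a minimal sketch of tenfold cross-validated R² for an averaged ensemble. It is not the study's implementation: the data are synthetic (96 points standing in for the 96 samples), the two base learners (`fit_linear`, `fit_mean`) are toy stand-ins for RF, CART, SVR, GAM, and GLM, and combining members by an unweighted mean is an assumption, since the abstract does not state how the ensemble was built. All function names are hypothetical.

```python
import random
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kfold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_linear(x, y):
    """Toy base learner: one-predictor least squares; returns a predict function."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    return lambda v: intercept + slope * v


def fit_mean(x, y):
    """Toy base learner: always predicts the training mean."""
    y_bar = mean(y)
    return lambda v: y_bar


def cross_validated_r2(x, y, fitters, k=10, seed=0):
    """Out-of-fold R^2 for the unweighted mean of the fitted base learners."""
    preds = [0.0] * len(y)
    for test in kfold_indices(len(y), k, seed):
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        # Fit every base learner on the training folds only.
        models = [fit([x[i] for i in train], [y[i] for i in train])
                  for fit in fitters]
        # Ensemble prediction (assumed): plain average of the members.
        for i in test:
            preds[i] = mean(m(x[i]) for m in models)
    return r_squared(y, preds)


# Synthetic covariate and response, NOT the study's data.
rng = random.Random(1)
x = [float(i) for i in range(96)]
y = [3.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

print("linear only:", round(cross_validated_r2(x, y, [fit_linear]), 3))
print("averaged ensemble:", round(cross_validated_r2(x, y, [fit_linear, fit_mean]), 3))
```

Each prediction is made by models that never saw the held-out sample, so the reported R² is computed from out-of-fold predictions pooled across all ten folds rather than from a fit to the full dataset.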