Abstract
This study proposes a cellular automata (CA) model integrated with logistic regression for oil spill simulation, addressing the difficulty of determining parameters in traditional CA models. The method comprises data preprocessing (geospatial alignment, resampling, and normalization), Monte Carlo sampling to build the training data, logistic regression-based weighting of impact factors, computation of the neighborhood function and stochastic term, and iterative oil spill simulation. The model is calibrated through sensitivity analyses of sampling ratio, spatial scale, and neighborhood structure, and validated against DeepSpill experimental data. The best accuracy (97.40 %) was obtained with a 22 % sampling ratio, a 12.61 % oil area proportion, a 6 m spatial scale, and a 7 × 7 Moore neighborhood.
• Innovative model integration and calibration: Logistic regression is merged with CA to objectively quantify environmental drivers (currents, wind, salinity) and to optimize parameters (sampling ratio, spatial scale, and neighborhood) in oil spill simulation.
• Dynamic optimization and scale sensitivity: Accuracy peaks at 96.41 % with a 22 % sampling rate and a 12.61 % oil area proportion; at 6 m resolution, accuracy reaches 97.32 %, balancing resolution against boundary roughness.
• Neighborhood-driven diffusion enhancement: A 7 × 7 Moore neighborhood raises accuracy to 97.40 % (vs. 3 × 3), showing that neighborhood size strongly shapes diffusion dynamics.
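The transition rule summarized in the abstract (logistic regression weights combined with a neighborhood function and a stochastic term) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the multiplicative combination of the three terms, the stochastic form 1 + (-ln γ)^α (a convention common in CA modeling), the fixed threshold, and the rule that oiled cells stay oiled are all assumptions made for illustration. The weights and bias are taken as given, as if already fitted by logistic regression on Monte Carlo-sampled cells.

```python
import math
import random


def logistic(z):
    """Standard logistic function mapping a linear score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def moore_fraction(grid, r, c, size=3):
    """Fraction of oiled cells in the size x size Moore neighborhood of (r, c).

    The center cell is excluded and cells outside the grid are ignored.
    """
    half = size // 2
    oiled = total = 0
    for i in range(r - half, r + half + 1):
        for j in range(c - half, c + half + 1):
            if (i, j) == (r, c):
                continue
            if 0 <= i < len(grid) and 0 <= j < len(grid[0]):
                total += 1
                oiled += grid[i][j]
    return oiled / total if total else 0.0


def ca_step(grid, factors, weights, bias, size=3, alpha=1.0,
            threshold=0.5, rng=None):
    """Advance the oil/no-oil grid by one CA iteration (hypothetical rule).

    grid:    2D list of 0 (water) / 1 (oil)
    factors: 2D list of per-cell normalized driver values,
             e.g. [current, wind, salinity]
    weights, bias: logistic regression coefficients (assumed already fitted)
    """
    rng = rng or random.Random(0)
    new = [row[:] for row in grid]
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            if grid[r][c] == 1:
                continue  # assumption: oiled cells stay oiled
            z = bias + sum(w * x for w, x in zip(weights, factors[r][c]))
            suitability = logistic(z)
            neighborhood = moore_fraction(grid, r, c, size)
            gamma = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
            stochastic = 1.0 + (-math.log(gamma)) ** alpha
            # Assumed combination: product of the three terms vs. a threshold.
            if suitability * neighborhood * stochastic > threshold:
                new[r][c] = 1
    return new
```

Calling `ca_step` repeatedly with `size=3` or `size=7` gives a rough feel for how the neighborhood window changes the spreading pattern, which is the kind of sensitivity the calibration step examines.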