Satellite-based oil spill detection using an explainable ViR-SC hybrid deep learning ensemble for improved accuracy and transparency



Abstract

Oil spills pose a severe threat to marine and coastal environments, requiring accurate and timely detection to reduce ecological and economic damage. Synthetic Aperture Radar (SAR) is widely used for marine monitoring due to its ability to capture ocean surface features under all-weather and day–night conditions. However, speckle noise and look-alike phenomena in SAR imagery significantly hinder reliable spill identification. To address these challenges, this study introduces an explainable deep learning framework comprising three quantitatively defined components that work together to improve detection accuracy. First, a denoising autoencoder with two convolutional layers (16 and 32 filters) and two transposed convolution layers is used to suppress SAR-specific speckle noise, improving downstream feature clarity and enhancing segmentation accuracy by stabilizing texture representation. Second, a U-Net++ segmentation network with nested skip connections and three encoder–decoder stages is employed to localize potential spill regions, providing structured spatial priors that guide the classifier toward more discriminative regions. Third, the ViR-SC ensemble classifier, which integrates five independently trained models—CNN, ResNet18, Vision Transformer, Support Vector Machine, and Random Forest—aggregates local, hierarchical, and global feature cues to improve classification robustness. The ensemble voting mechanism strengthens sensitivity to subtle slick structures while reducing errors arising from individual model biases. To ensure interpretability, Grad-CAM highlights class-discriminative spatial regions for CNN-based models, while SHAP quantifies feature importance for classical machine learning components. Experiments were conducted on a publicly available Sentinel-1 SAR dataset containing 5,630 labeled image patches (1,905 oil, 3,725 non-oil).
Among single models, the Vision Transformer achieved 98.00% accuracy, whereas the proposed ViR-SC ensemble improved performance to 98.45%, demonstrating measurable gains from component integration. Explainability results further confirm that model decisions correspond to actual oil spill structures in the imagery.
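To illustrate the ensemble aggregation step, the following is a minimal sketch of hard majority voting over per-patch binary predictions from the five base models. The abstract does not specify whether voting is hard or soft (probability-averaged), so this assumes simple hard voting; the model names in the comments and the prediction values are hypothetical placeholders, not the paper's actual outputs.

```python
import numpy as np

def majority_vote(predictions):
    """Aggregate binary predictions (0 = non-oil, 1 = oil) from several
    independently trained models by simple hard majority voting.

    predictions: array-like of shape (n_models, n_patches).
    Returns an int array of shape (n_patches,) with the winning label
    per patch (a patch is labelled 1 when strictly more than half of
    the models vote 1).
    """
    votes = np.asarray(predictions)
    # sum of ones per patch, compared against half the number of models
    return (votes.sum(axis=0) * 2 > votes.shape[0]).astype(int)

# Hypothetical per-patch predictions from the five base models
# (CNN, ResNet18, ViT, SVM, Random Forest) on four SAR patches:
preds = [
    [1, 0, 1, 0],  # CNN
    [1, 0, 1, 1],  # ResNet18
    [1, 0, 0, 0],  # Vision Transformer
    [0, 1, 1, 0],  # SVM
    [1, 0, 1, 0],  # Random Forest
]
print(majority_vote(preds).tolist())  # -> [1, 0, 1, 0]
```

With an odd number of models, hard voting never ties on a binary task, which is one practical reason a five-model ensemble is convenient; a weighted or probability-averaged scheme would instead combine each model's confidence scores.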
