Abstract
The growing demand for trustworthy machine learning has raised concerns about the security, reliability, and controllability of deep learning, especially when it is applied to sensitive domains involving life and health safety. To thoroughly analyze potential attacks and promote innovation in security technologies for DNNs, this paper studies adversarial attacks against medical images and proposes LatAtk, a medical image attack method that focuses on lesion areas and exhibits strong transferability. First, using an image segmentation algorithm, LatAtk divides the target image into an attackable region (the lesion area) and a non-attackable region, and injects perturbations only into the attackable region to disrupt the attention of DNNs. Second, a class activation loss based on gradient-weighted class activation mapping (Grad-CAM) is proposed; by estimating the importance of image features, it further perturbs the features that contribute positively to the model's decision, which makes LatAtk highly transferable. Third, a texture feature loss based on local binary patterns is introduced as a constraint to reduce damage to non-semantic features, effectively preserving the texture of the target image and improving the imperceptibility of the adversarial examples. Experimental results show that LatAtk achieves superior attack strength, transferability, and imperceptibility compared with advanced baselines.
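The following is a minimal, hedged sketch of how the three components described above could fit together: a lesion-mask-restricted perturbation, a Grad-CAM-style activation term, and a local-binary-pattern texture constraint. It is an illustration under assumed names and weights (the mask, `alpha`, and `beta` are hypothetical placeholders), not the authors' implementation of LatAtk.

```python
# Illustrative sketch only -- the lesion mask, weights, and loss composition
# are assumptions for exposition, not LatAtk's actual implementation.
import numpy as np
from skimage.feature import local_binary_pattern


def masked_perturbation(image, perturbation, lesion_mask):
    """Inject the perturbation only inside the attackable (lesion) region."""
    return image + perturbation * lesion_mask


def lbp_texture_loss(original, adversarial, n_points=8, radius=1):
    """Texture constraint: penalize differences between the local-binary-pattern
    histograms of the original and adversarial images, so non-semantic texture
    features are largely preserved."""
    lbp_orig = local_binary_pattern(original, n_points, radius, method="uniform")
    lbp_adv = local_binary_pattern(adversarial, n_points, radius, method="uniform")
    hist_orig, _ = np.histogram(lbp_orig, bins=n_points + 2, density=True)
    hist_adv, _ = np.histogram(lbp_adv, bins=n_points + 2, density=True)
    return float(np.sum((hist_orig - hist_adv) ** 2))


def total_objective(attack_loss, cam_loss, texture_loss, alpha=1.0, beta=0.1):
    """Composite objective: classification attack term plus a Grad-CAM-based
    class activation term, regularized by the texture constraint.
    The weights alpha and beta are illustrative, not the paper's values."""
    return attack_loss + alpha * cam_loss + beta * texture_loss
```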