Abstract
According to the World Health Organization (WHO), skin cancer (SC) is a growing medical concern, with millions of cases diagnosed each year. Deep learning methods show promising results in the early detection and classification of SC, but accurate detection remains difficult because of non-homogeneous luminance, low contrast, and hair covering the skin in the images. To address these challenges, a novel transformer-based dual architecture is proposed, comprising one model for skin lesion classification and one for skin lesion segmentation. The proposed classification model, GlobalSkinNet, is based on the Global Contextual Vision Transformer, which combines convolutional and transformer components. GlobalSkinNet is trained for skin lesion classification using the selected hyperparameters and the Adam optimizer. Its performance is assessed on four benchmark datasets, PH2, ISIC-2019, ISIC-2020, and HAM10000, on which it achieves accuracies of 100%, 98%, 97%, and 98%, respectively. The classified images are subsequently processed by the proposed segmentation model, SkinFormNet. In this model, features are extracted using a pre-trained SegFormer model and then passed to a U-Net architecture with an attention mechanism. The segmentation model is evaluated on five benchmark datasets, PH2, ISIC-2016, ISIC-2017, ISIC-2018, and HAM10000, on which it achieves Dice coefficients of 0.97, 0.98, 0.94, 0.99, and 0.96, respectively.
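
For readers unfamiliar with the segmentation metric reported above, the Dice coefficient measures the overlap between a predicted lesion mask A and a ground-truth mask B as 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical masks). The following is a minimal sketch of that standard definition, not the implementation used in this work. The function name `dice_coefficient`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are illustrative assumptions.

```python
from typing import Sequence


def dice_coefficient(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    # Both masks must have the same number of rows and columns.
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    # Count pixels that are foreground (non-zero) in both masks.
    intersection = sum(
        1
        for p_row, t_row in zip(pred, target)
        for p, t in zip(p_row, t_row)
        if p and t
    )
    # Count foreground pixels in each mask separately.
    pred_area = sum(1 for row in pred for p in row if p)
    target_area = sum(1 for row in target for t in row if t)

    total = pred_area + target_area
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 2x3 masks, for illustration only (not data from the paper).
example_pred = [[0, 1, 1], [0, 1, 0]]
example_target = [[0, 1, 0], [0, 1, 1]]
print(f"Dice: {dice_coefficient(example_pred, example_target):.3f}")
```

In the hypothetical example, two of the three foreground pixels in each mask coincide, so the score is 2·2 / (3 + 3) ≈ 0.667.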