Abstract
BACKGROUND: Artificial intelligence (AI)-based predictive systems generate heat maps that highlight informative regions as proxy explanations for diagnostic predictions. Comparing AI- and dermatologist-derived heat maps may help determine whether the AI's attention aligns with clinically relevant features, an essential step toward improving interpretability. OBJECTIVE: To compare dermatologist-derived and AI-generated heat maps of dermoscopic images. METHODS: Four dermatologists, blinded to diagnoses, inspected 120 dermoscopic images in randomized order. Their eye movements were tracked to generate heat maps. The dataset included melanomas, basal cell carcinomas, squamous cell carcinomas, nevi, benign keratoses, and vascular lesions in equal numbers. For the same images, class activation maps were produced using the DEXI algorithm (Dermoscopy EXplainable Intelligence, Canfield Scientific's Vectra software system). Overlap was assessed using pixel-wise rank correlation. Inter-dermatologist correlation served as the upper reference, and the median null correlation between DEXI maps and non-homologous dermatologist maps defined the lower reference. RESULTS: Median pixel-wise correlation was ρ = 0.540 for dermatologist-DEXI, ρ = 0.591 for inter-dermatologist, and ρ = 0.434 for null comparisons. LIMITATIONS: Sample sizes for each skin lesion type were small, and lesion size data were not available. CONCLUSIONS: Dermatologist and DEXI heat maps showed substantial overlap, suggesting shared diagnostic anchors and underscoring the model's potential for interpretability.
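The pixel-wise rank correlation described in METHODS can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: that the rank correlation is Spearman's ρ, that both heat maps are arrays of identical shape, that "non-homologous" means maps belonging to different images, and that the null reference uses every mismatched pair. The function names (`average_ranks`, `pixelwise_spearman`, `median_matched_and_null`) are hypothetical and are not part of DEXI or any Canfield software.

```python
import numpy as np


def average_ranks(values):
    """Return 1-based ranks of the flattened input; ties get their average rank."""
    flat = np.asarray(values, dtype=float).ravel()
    _, inverse, counts = np.unique(flat, return_inverse=True, return_counts=True)
    # For each group of tied values, the highest rank is the cumulative count;
    # the average rank of the group is that value minus (count - 1) / 2.
    highest = np.cumsum(counts)
    group_rank = highest - (counts - 1) / 2.0
    return group_rank[inverse.ravel()]


def pixelwise_spearman(map_a, map_b):
    """Spearman's rho between two heat maps (assumed metric, see lead-in).

    Returns nan if either map is constant, because its ranks have no variance.
    """
    a = np.asarray(map_a, dtype=float)
    b = np.asarray(map_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("heat maps must have the same shape")
    # Spearman's rho is the Pearson correlation of the rank-transformed pixels.
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])


def median_matched_and_null(ai_maps, human_maps):
    """Median rho for same-image pairs and for mismatched (null) pairs.

    Assumption: the null reference pairs each AI map with the human maps of
    every other image; the abstract does not state the pairing scheme.
    """
    n = len(ai_maps)
    if n != len(human_maps):
        raise ValueError("need one human map per AI map")
    matched = [pixelwise_spearman(ai_maps[i], human_maps[i]) for i in range(n)]
    null = [
        pixelwise_spearman(ai_maps[i], human_maps[j])
        for i in range(n)
        for j in range(n)
        if i != j
    ]
    return float(np.median(matched)), float(np.median(null))
```

Because the correlation is computed on ranks, it is unaffected by any monotonic rescaling of either heat map, which is useful when gaze-density maps and class activation maps are expressed on different intensity scales.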