Abstract
OBJECTIVES: Skin cancer is the most common malignancy in the United States, with more than five million cases diagnosed annually among 3.3 million individuals. Melanoma, the deadliest form of skin cancer, accounts for roughly 200 000 new diagnoses each year and nearly 10 000 deaths. AI-based skin cancer detection is being developed and tested in laboratory and academic settings as a promising approach to improve access and reduce disparities. However, current models often underperform on darker skin tones (Fitzpatrick Types V and VI), creating fairness concerns that must be addressed prior to clinical deployment. Existing fairness-aware methods focus on algorithmic adjustments while neglecting data quality and representation. We introduce FAIR-SCAN (Fairness and Accuracy through Ranking-Based Subset Selection for Skin Cancer Detection), a data-centric framework that enhances fairness through subset selection guided by marginal contribution score (MCS) estimation.
MATERIALS AND METHODS: FAIR-SCAN ranks data points by their contribution to both accuracy and fairness, then selects an optimal subset for training. We evaluated its effectiveness using images from Diverse Dermatology Images (DDI) and Fitzpatrick 17K.
RESULTS: FAIR-SCAN improved balance in accuracy, True Positive Rate, and False Positive Rate across skin tones while reducing the training dataset by 50%, outperforming algorithm-focused fairness methods.
DISCUSSION: These findings highlight the importance of strategic data selection in mitigating bias in AI-driven diagnostics. FAIR-SCAN's data-centric approach enhances both precision and equity in skin cancer detection.
CONCLUSION: Strategic data selection is critical for equitable AI-driven diagnostics. FAIR-SCAN advances fairness and accuracy in skin cancer detection, supporting development of trustworthy clinical AI systems.
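The rank-then-select idea described under MATERIALS AND METHODS can be illustrated with a minimal sketch. The abstract does not specify how the marginal contribution score is estimated, so everything below is an assumption made for illustration: the permutation-sampling estimator, the utility (validation accuracy minus a penalty on the across-group spread of accuracy, TPR, and FPR), the weight `lam`, the nearest-centroid stand-in classifier, and all function names. It should not be read as the FAIR-SCAN implementation.

```python
import numpy as np

def fit_predict(X_tr, y_tr, X_val):
    """Nearest-centroid classifier: a cheap stand-in for the real model (assumption)."""
    classes = np.unique(y_tr)
    centroids = np.stack([X_tr[y_tr == c].mean(axis=0) for c in classes])
    dists = ((X_val[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]

def group_gap(y_true, y_pred, groups):
    """Mean across-group spread (max - min) of accuracy, TPR, and FPR.

    Assumes every group contains both positive and negative labels.
    """
    acc, tpr, fpr = [], [], []
    for g in np.unique(groups):
        yt, yp = y_true[groups == g], y_pred[groups == g]
        acc.append(np.mean(yt == yp))
        tpr.append(np.mean(yp[yt == 1] == 1))
        fpr.append(np.mean(yp[yt == 0] == 1))
    return float(np.mean([max(m) - min(m) for m in (acc, tpr, fpr)]))

def utility(y_true, y_pred, groups, lam=1.0):
    """Hypothetical utility: overall accuracy minus lam times the fairness gap."""
    return float(np.mean(y_true == y_pred)) - lam * group_gap(y_true, y_pred, groups)

def marginal_contribution_scores(X, y, X_val, y_val, g_val,
                                 n_perm=20, lam=1.0, seed=0):
    """Estimate each training point's average marginal gain in utility.

    For each random ordering, points are added one at a time and the change
    in validation utility is credited to the point just added.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    scores = np.zeros(n)
    for _ in range(n_perm):
        perm = rng.permutation(n)
        prev = 0.0  # utility of the empty training set, set to 0 by assumption
        for k, i in enumerate(perm, start=1):
            idx = perm[:k]
            cur = utility(y_val, fit_predict(X[idx], y[idx], X_val), g_val, lam)
            scores[i] += cur - prev
            prev = cur
    return scores / n_perm

def select_subset(scores, fraction=0.5):
    """Return indices of the top-scoring fraction of training points."""
    k = int(len(scores) * fraction)
    return np.argsort(-scores)[:k]
```

With training arrays `X`, `y` and a validation set `X_val`, `y_val` annotated with skin-tone groups `g_val`, calling `select_subset(marginal_contribution_scores(X, y, X_val, y_val, g_val), fraction=0.5)` returns the indices of the half of the training set with the highest estimated scores. The 0.5 fraction mirrors the 50% reduction reported in RESULTS; the cost of this estimator grows with `n_perm` times the number of training points, so a practical system would need a cheaper approximation.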