Abstract
Digital pathology images from different hospitals often exhibit variations in color styles due to differences in staining processes, tissue handling, and scanning devices. Integrating data from multiple centers is essential for developing artificial intelligence-driven digital pathology (AIDP) models with improved generalization. However, privacy concerns complicate data sharing and hinder this integration. Here, we propose a self-supervised model, stain lookup table (StainLUT), that leverages the inherent structural similarity between pathology tissue samples of the same disease type across different medical centers and enables stain normalization without the need for cross-center data transfer. When applied to single-center AIDP models, StainLUT enables cross-center tumor localization at the whole-slide level and tumor classification at the patch level, with performance comparable to that of AIDP models trained on centralized or same-center data. StainLUT offers a privacy-preserving solution for stain normalization in unseen medical centers and has the potential to facilitate the future deployment of AIDP foundation models under privacy regulations.