Abstract
Classical hematoxylin and eosin (H&E) staining enables review of tissue morphology but provides no information about the molecular state of cells. Immunohistochemical (IHC) techniques label specific proteins in tissue, allowing differentiation of relevant structures that may go undetected in H&E. However, the IHC process is complex, expensive, and time-consuming, especially for multiplex IHC (mIHC), limiting its use in large cohorts. Stain conversion of H&E to IHC using generative artificial intelligence models such as generative adversarial networks (GANs) represents one solution to this problem. However, GANs are unstable during out-of-distribution sampling and are prone to hallucinations or mode collapse, limiting their accuracy in challenging image conversion tasks. To address this, the field has recently turned to diffusion models. Here, we introduce Schrödinger-bridge for Multiplex ImmunoLabel Estimation (SMILE). Unlike conventional diffusion models that map from source to target through intermediate Gaussian noise, Schrödinger-bridge diffusion models skip this step and have been shown to better preserve structures during image translation. To test the performance of SMILE, we generated a large cohort of high-fidelity H&E-mIHC image pairs from pancreatic organ donors, targeting insulin, glucagon, and CD3. Our dataset is well sampled across type-1 diabetes status, pancreas anatomical location, age, and sex. Using this cohort, we demonstrate the superiority of SMILE over GANs via a comprehensive evaluation framework incorporating texture, distribution, and antibody-specific metrics, as well as blinded pathologist reviews. We further confirmed the ability of SMILE to generate accurate mIHC images from H&Es produced at an external site, to perform whole slide image conversion, and to generate realistic three-dimensional maps of the pancreatic islets in non-diabetic, auto-antibody positive, and type-1 diabetic donor tissue.
Finally, we performed stain conversion of paired H&E to HER2 and Ki67 images in breast cancer, confirming the superiority of SMILE in diverse stain conversion applications. Collectively, this framework provides a scalable pipeline for high-throughput proteomic inference from archival H&Es, with transformative potential for pancreatic research and digital pathology.