Abstract
Background: Predicting genomic alterations from routine hematoxylin and eosin (H&E) whole-slide images (WSIs) may help triage molecular testing. Methods: We retrospectively enrolled 437 patients at Osaka Metropolitan University Hospital across 26 cancer types, each matched with clinical gene-panel data. We curated 1023 binary endpoints across single-nucleotide variant (SNV), copy-number variant (CNV), and structural variant (SV) categories. We extracted slide embeddings from five pathology foundation models (Prism, GigaPath, Feather, Chief, and Titan) using a unified feature-extraction pipeline and benchmarked them with a lightweight downstream multilayer perceptron (MLP) classifier. Using the best-performing patch feature extractor, we trained a multiple-instance learning model to assess the incremental benefit of patch-level training. Results: Titan achieved the highest and most stable transfer performance, with a median endpoint-wise area under the receiver operating characteristic curve (AUROC) of 0.77 in the slide-level benchmark; at the patch level, prediction of APC_SNV reached an AUROC of 0.916 and prediction of KRAS_SNV reached an AUROC of 0.811 on the held-out test set. Conclusions: In a heterogeneous clinical gene-panel setting, pathology foundation models can provide strong baseline genomic-prediction signals without additional fine-tuning. We propose a practical, deployment-oriented two-stage workflow: rapid slide-embedding screening to prioritize robust representations and candidate endpoints, followed by patch-level training for high-value tasks where additional performance gains and interpretable regions are clinically worthwhile.
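
As an illustration of the summary metric named above, a median endpoint-wise AUROC can be obtained by scoring each binary endpoint separately and then taking the median over endpoints. The following is a minimal sketch under stated assumptions, not the study's implementation: the function names, the dictionary layout, the rule of skipping endpoints with only one class present, and the toy values (which are not study data) are all hypothetical.

```python
from statistics import median


def auroc(labels, scores):
    """Rank-based AUROC (Mann-Whitney formulation); ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUROC is undefined when only one class is present
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def median_endpointwise_auroc(y_true, y_score):
    """Both arguments map endpoint name -> list of per-slide values.

    Assumption: endpoints with an undefined AUROC are skipped.
    """
    per_endpoint = {}
    for name, labels in y_true.items():
        value = auroc(labels, y_score[name])
        if value is not None:
            per_endpoint[name] = value
    return median(per_endpoint.values()), per_endpoint


# Toy inputs for illustration only (hypothetical endpoints, not study data).
y_true = {
    "ENDPOINT_A": [1, 0, 1, 0],
    "ENDPOINT_B": [1, 1, 0, 0],
    "ENDPOINT_C": [0, 0, 0, 0],  # single class: skipped
}
y_score = {
    "ENDPOINT_A": [0.9, 0.2, 0.4, 0.6],
    "ENDPOINT_B": [0.8, 0.7, 0.3, 0.1],
    "ENDPOINT_C": [0.1, 0.2, 0.3, 0.4],
}
summary, per_endpoint = median_endpointwise_auroc(y_true, y_score)
print(summary, per_endpoint)
```

Taking the median rather than the mean keeps the summary less sensitive to a few endpoints with extreme scores, which matters when many endpoints have few positive cases.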