Abstract
Spatial transcriptomics has revolutionized the analysis of gene expression by preserving tissue spatial information, providing novel insights into the cellular composition and function of complex biological tissues. However, current technologies are constrained by limited resolution and data sparsity, which compromise the accuracy of downstream analyses. To address these challenges, we developed SpaVGN, a deep learning framework that integrates convolutional neural networks, vision transformers, and graph neural networks for high-fidelity gene expression imputation and spatial domain identification. By combining local feature extraction, global attention mechanisms, and spatial graph-based modeling, SpaVGN reconstructs missing transcriptomic data while preserving spatial tissue architecture. Evaluated on melanoma and sagittal posterior mouse brain datasets, SpaVGN outperformed existing methods in gene expression prediction, achieving Pearson correlation coefficients of 0.609 (melanoma) and 0.682 (mouse brain). It clearly delineated tumor regions and lymphoid niches in melanoma tissue and, in the mouse brain, achieved fine-grained resolution of hippocampal subfields, including Cornu Ammonis and Dentate Gyrus, with a Silhouette Score of 0.43 and a Davies-Bouldin Index of 0.86. Validation through UMAP dimensionality reduction and PAGA network analysis demonstrated that SpaVGN significantly mitigates the negative impact of data sparsity in spatial transcriptomics, improving data completeness and spatial continuity. This study presents a solution that enhances the resolution of spatial transcriptomics data, offers cross-tissue applicability, and provides a valuable tool for research in biological development, disease, and tumor heterogeneity.
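The three evaluation metrics reported here (Pearson correlation coefficient, Silhouette Score, Davies-Bouldin Index) follow standard definitions. As a minimal illustrative sketch, not the authors' evaluation code, they could be computed in plain Python as shown below. The function names and the toy inputs are hypothetical, and Euclidean distance is an assumption.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def _groups(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return groups


def silhouette_score(points, labels):
    """Mean silhouette coefficient (Euclidean); higher is better, max 1."""
    groups = _groups(points, labels)
    scores = []
    for p, lab in zip(points, labels):
        own = groups[lab]
        if len(own) == 1:
            # Convention: a singleton cluster contributes a score of 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean(math.dist(p, q) for q in other)
            for k, other in groups.items()
            if k != lab
        )
        scores.append((b - a) / max(a, b))
    return mean(scores)


def davies_bouldin_index(points, labels):
    """Davies-Bouldin index (Euclidean); lower is better, min 0."""
    groups = _groups(points, labels)
    cents = {k: [mean(c) for c in zip(*v)] for k, v in groups.items()}
    # Scatter: mean distance of each cluster's points to its centroid.
    scatter = {
        k: mean(math.dist(p, cents[k]) for p in v) for k, v in groups.items()
    }
    worst = []
    for i in groups:
        # For each cluster, keep its worst (largest) similarity ratio.
        worst.append(
            max(
                (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                for j in groups
                if j != i
            )
        )
    return mean(worst)


# Toy data (hypothetical): two well-separated square clusters in 2D.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11)]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
```

In this reading, a higher Pearson coefficient indicates closer agreement between predicted and measured expression, a higher Silhouette Score indicates more compact and better-separated spatial domains, and a lower Davies-Bouldin Index indicates the same.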