Abstract
Mass spectrometry imaging (MSI)-based spatial metabolomics data contain extensive missing values, yet practical guidance on how the choice of imputation method affects imputation accuracy and downstream spatial analyses remains limited. In this study, we evaluated eight imputation methods, including existing approaches and a graph convolutional network (GCN)-based method designed specifically for spatial metabolomics data. To enable a comprehensive assessment, we developed an evaluation framework built on two objective criteria: (a) imputation accuracy and (b) preservation of spatial cluster structure. We assembled six benchmark datasets spanning mouse brain and liver, human kidney and stomach, and plant seed sections, and simulated missing values through controlled dropout. Random forest (RF) imputation ranked first overall, and the GCN-based method ranked second on both criteria. This systematic benchmark, which assesses imputation from both perspectives, provides guidance for selecting imputation strategies in spatial metabolomics research.