Abstract
Accurate cell type annotation in imaging mass cytometry (IMC) and related technologies depends critically on preprocessing steps such as normalization, segmentation, and marker aggregation. A clearer separation between negative and positive signals enables more precise cell boundary inference and more robust assignment of markers to single cells. However, inherent limitations in spatial resolution and uncertain cell boundaries can cause spillover, in which signal leaks from one cell into a neighboring cell, distorting marker intensities and leading to incorrect cell type annotation. To address these challenges, we first systematically investigated the impact of spatial resolution and segmentation variability on per-cell marker aggregation using simulated IMC datasets, establishing upper limits for reliable marker separation in cell type annotation. We further analyzed technical biases in large-scale studies, demonstrating that appropriate normalization approaches can significantly reduce batch effects without compromising biological variability. Finally, we benchmarked multiple spillover correction strategies across both (semi-)simulated and real IMC datasets. Our results identified two simple methods that improved annotation performance: spatial smoothing of intensity images followed by mean marker aggregation, and resampling of cell masks followed by mean marker aggregation and median calculation. Other methods often performed worse than the baseline of simple mean aggregation. Together, these findings underscore the central importance of spatial resolution, normalization, and marker aggregation in IMC data preprocessing for accurate single-cell annotation.
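
To make the first of the two strategies concrete, the following is a minimal Python sketch of spatial smoothing followed by mean marker aggregation. It is an illustration under stated assumptions, not the implementation used in this study: the box (mean) filter, its radius, the edge handling, and the function names `box_smooth` and `mean_per_cell` are all hypothetical choices, and the mask is assumed to be an integer label image with 0 as background.

```python
import numpy as np


def box_smooth(image, radius=1):
    """Mean filter with a (2*radius+1)^2 window.

    Assumption: a box filter with replicated edges stands in for whatever
    smoothing kernel is actually used; the kernel is not specified here.
    """
    size = 2 * radius + 1
    padded = np.pad(image.astype(float), radius, mode="edge")
    height, width = image.shape
    out = np.zeros((height, width), dtype=float)
    # Sum every shifted copy of the image inside the window, then average.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + height, dx:dx + width]
    return out / size**2


def mean_per_cell(image, mask):
    """Mean intensity per cell label; label 0 (background) is dropped."""
    labels = mask.ravel()
    sums = np.bincount(labels, weights=image.ravel())
    counts = np.bincount(labels)
    ids = np.nonzero(counts)[0]
    ids = ids[ids != 0]
    return {int(i): float(sums[i] / counts[i]) for i in ids}


# Toy single-channel image: two adjacent cells, marker present only in cell 1.
marker = np.zeros((4, 6))
marker[:, :3] = 10.0
cells = np.zeros((4, 6), dtype=int)
cells[:, :3] = 1
cells[:, 3:] = 2

baseline = mean_per_cell(marker, cells)              # simple mean aggregation
smoothed = mean_per_cell(box_smooth(marker), cells)  # smoothing, then mean
```

On this noise-free toy input, smoothing only blurs the boundary between the two cells, so the sketch shows the mechanics of the pipeline rather than the improvement reported above.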