Abstract
A standard task in the analysis of spatially resolved transcriptomics (SRT) data is to identify spatially variable genes (SVGs). This is most commonly done within one tissue section at a time because the spatial relationships between tissue sections are typically unknown. However, large-scale spatial atlases are now being generated, for example across hundreds of donors, where the goal is to identify a common set of SVGs to use for downstream analyses. One challenge is identifying and removing SVGs that are associated with a known bias or technical artifact, such as the slide or capture area; retaining these genes can lead to poor performance in downstream analyses such as spatial domain detection. Here, we introduce BatchSVG, a tool to identify batch-biased genes in the context of SVG detection. Our approach compares the rank of per-gene deviance under a binomial model fitted (i) with and (ii) without a covariate that is associated with the known bias or technical artifact. If the rank of a gene changes significantly between the two models, then we infer that this gene is likely associated with the bias or technical artifact and should be removed from the downstream analysis. We consider two SRT datasets and show how our approach can improve the results of downstream analysis.
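The rank comparison described above can be illustrated with a minimal Python/NumPy sketch. It is not the BatchSVG implementation, and several details are assumptions: the per-gene deviance is taken to be the standard binomial deviance against a constant-proportion null model, the covariate is handled by fitting that null proportion separately within each batch level, and the function names (`binomial_deviance`, `rank_desc`, `rank_shift`) are hypothetical. The abstract does not state the cutoff for a "significant" rank change, so the sketch only returns the per-gene rank shift.

```python
import numpy as np

def _a_log_a_over_b(a, b):
    """Elementwise a * log(a / b), using the convention 0 * log(0) = 0."""
    out = np.zeros_like(a, dtype=float)
    mask = a > 0
    out[mask] = a[mask] * np.log(a[mask] / b[mask])
    return out

def binomial_deviance(counts, batch=None):
    """Per-gene binomial deviance for a (spots x genes) count matrix.

    Assumption: with a batch covariate, the null proportion of each gene
    is estimated separately in each batch and the deviances are summed.
    Every batch is assumed to contain at least one non-zero count.
    """
    counts = np.asarray(counts, dtype=float)
    if batch is None:
        batch = np.zeros(counts.shape[0], dtype=int)  # one global batch
    batch = np.asarray(batch)
    dev = np.zeros(counts.shape[1])
    for b in np.unique(batch):
        y = counts[batch == b]
        n = y.sum(axis=1, keepdims=True)  # total counts per spot
        pi = y.sum(axis=0) / n.sum()      # null proportion per gene
        mu = n * pi                       # expected counts under the null
        dev += 2.0 * (_a_log_a_over_b(y, mu)
                      + _a_log_a_over_b(n - y, n - mu)).sum(axis=0)
    return dev

def rank_desc(x):
    """Rank values in descending order; rank 1 is the largest value."""
    order = np.argsort(-np.asarray(x), kind="stable")
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, len(order) + 1)
    return ranks

def rank_shift(counts, batch):
    """Rank with the batch covariate minus rank without it, per gene.

    A large positive value means the gene drops in the deviance ranking
    once the covariate is accounted for.
    """
    rank_without = rank_desc(binomial_deviance(counts))
    rank_with = rank_desc(binomial_deviance(counts, batch))
    return rank_with - rank_without
```

Calling `rank_shift(counts, batch)` with a spots-by-genes count matrix and a vector of slide or capture-area labels returns one integer per gene. Under the assumptions above, genes whose high deviance is driven mainly by the covariate tend to fall in the ranking once it is included, which is the signal the approach uses to flag them.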