Abstract
As spatial molecular data grow in scope and resolution, there is a pressing need to identify key spatial structures associated with disease. Current approaches typically make restrictive assumptions, such as representing tissue regions by local abundances of manually typed, discrete cell types, or representing samples in terms of abundances of manually called, discrete spatial structures; this risks overlooking important signals. Here we introduce variational inference-based microniche analysis (VIMA), a method that combines deep learning with principled statistics to discover disease-associated spatial features with greater flexibility and precision. VIMA trains an ensemble of variational autoencoders to extract numerical "fingerprints" from small tissue patches that capture their biological content. It uses these fingerprints to define a large number of data-dependent "microniches": small, potentially overlapping groups of tissue patches with highly similar biology that span multiple samples. It then meta-analyzes across the autoencoders to identify microniches whose abundance correlates with case-control status, while controlling for multiple testing. We show in simulations that VIMA is well calibrated. We then apply VIMA to spatial datasets spanning three diseases and three spatial modalities: a 7-marker immunofluorescence (IF) microscopy dataset in rheumatoid arthritis (RA), a 52-marker CO-Detection by indEXing (CODEX) dataset in ulcerative colitis (UC), and a 140-gene spatial transcriptomics dataset in dementia. In each case, we recapitulate known biology and identify novel spatial features of disease that were not discoverable with current state-of-the-art methods.
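The pipeline summarized above (patch fingerprints, data-dependent microniches, per-sample abundance testing) can be sketched in miniature. This is an illustrative toy, not the paper's implementation: it substitutes PCA for the VAE-ensemble encoder, uses k-nearest-neighbor balls around random anchor patches as "microniches," a plain two-sample t statistic in place of the paper's meta-analysis and multiple-testing control, and synthetic data; every name and parameter value here is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic toy data: 40 samples (20 cases, 20 controls), 50 patches each,
# each patch summarized by a 10-dimensional marker vector (all illustrative).
n_samples, n_patches, dim = 40, 50, 10
labels = np.array([1] * 20 + [0] * 20)            # case-control status
X = rng.normal(size=(n_samples, n_patches, dim))
X[labels == 1, :10, 0] += 2.0                     # enrich cases for a fake "disease niche"

patches = X.reshape(-1, dim)
sample_of_patch = np.repeat(np.arange(n_samples), n_patches)

# Stand-in embedding: PCA via SVD (the actual method trains VAE encoders).
centered = patches - patches.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
emb = centered @ vt[:3].T                         # 3-D "fingerprints" per patch

# Toy microniches: k-nearest-neighbor balls around random anchor patches;
# these groups are small, data-dependent, and may overlap.
k, n_anchors = 100, 30
anchors = rng.choice(len(emb), size=n_anchors, replace=False)
t_stats = []
for a in anchors:
    d = np.linalg.norm(emb - emb[a], axis=1)
    members = np.argsort(d)[:k]                   # patches in this microniche
    # Per-sample abundance = fraction of that sample's patches in the niche.
    abund = np.bincount(sample_of_patch[members], minlength=n_samples) / n_patches
    a1, a0 = abund[labels == 1], abund[labels == 0]
    se = np.sqrt(a1.var(ddof=1) / len(a1) + a0.var(ddof=1) / len(a0))
    t_stats.append((a1.mean() - a0.mean()) / se if se > 0 else 0.0)

print(f"max |t| across {n_anchors} toy microniches: "
      f"{max(abs(t) for t in t_stats):.2f}")
```

In the actual method, the ensemble of autoencoders yields several fingerprint spaces, association statistics are meta-analyzed across them, and calibration is maintained while controlling for multiple testing; none of that machinery is reproduced in this sketch.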