Abstract
Single-cell transcriptomics offers the promise of measuring the diversity of cellular phenotypes across species, diseases, and other biological conditions. Recently, foundation models have emerged to identify this variation, yet most methods represent each cell independently, despite technical limitations that reduce measurement precision at the single-cell level. Here, we present Stack, a foundation model trained on 149 million uniformly preprocessed human single cells that leverages tabular attention to generate representations of each cell informed by the other cells in its context. In the zero-shot setting, Stack offers substantial improvements on downstream tasks over baselines, whether those baselines are applied zero-shot, fine-tuned, or trained from scratch on the target dataset. Stack can perform in-context learning from unlabeled cells representing arbitrary conditions, such as a chemical perturbation or a different donor, and predict the effect of those conditions on a target cell population without requiring dataset-specific fine-tuning. We apply Stack to generate Perturb Sapiens, the first human whole-organism atlas of perturbed cells, spanning 28 tissues, 40 cell classes, and 201 perturbations, and validate subsets of Perturb Sapiens against in vitro stimulation profiles. Overall, Stack presents a new modeling framework in which cells themselves act as guiding examples at inference time, unlocking general-purpose in-context learning capabilities for single-cell biology.