Abstract
MOTIVATION: Disease models are fundamental tools in drug discovery and early-stage drug development, but they only approximate human disease, and selecting a suitable model is challenging. Quantitative computational methods exist to assess molecular resemblance to human conditions, but approaches that work at single-cell resolution, and do so in an explainable and generalizable way, remain very limited. RESULTS: We present singIST, a computational method for comparative single-cell transcriptomics analysis between disease models and human conditions. singIST provides explainable quantitative measures of the similarity of a disease model to the human reference at the pathway, cell type and gene levels. These measures jointly account for gene orthology, cell type presence in the model, cell type and gene importance in the human condition, and gene-level fold changes in the model, within a unifying framework that controls for the intrinsic complexities of single-cell data. We first test singIST on three well-characterized murine models against moderate-to-severe Atopic Dermatitis, showing that it recapitulates established biology while generating new hypotheses. We then apply it to Hidradenitis Suppurativa, comparing in vivo human lesions with ex vivo skin explants with and without CD3/CD28 stimulation, and show that stimulation selectively improves pathways that already recapitulate the human signal. Finally, we perform simulation studies that (i) unit-test the implementation and behaviour of the algorithm under controlled scenarios and (ii) compare singIST against a naïve baseline based on overlapping differentially expressed genes.