Abstract
Bulk RNA sequencing (RNA-seq) deconvolution typically uses single-cell RNA sequencing (scRNA-seq) references, but some cell types are detectable only by single-nucleus RNA sequencing (snRNA-seq). Because snRNA-seq captures nuclear rather than cytoplasmic transcripts, using it directly as a reference could reduce deconvolution accuracy. We benchmarked integration strategies across four tissues, comparing principal component (PC)-based latent shifts, conditional and non-conditional single-cell variational inference (scVI), and cross-modality differentially expressed gene (DEG) filtering. All approaches improved on raw snRNA-seq references, but pruning cross-modality DEGs produced the largest gains, often matching or exceeding scRNA-seq-only references. Conditional scVI performed comparably and remained effective when matched scRNA-seq and snRNA-seq cell types were unavailable. In real adipose bulk samples, DEG pruning and conditional scVI provided the most robust cell-fraction estimates across donors and transformations. These results indicate that scRNA-seq should be prioritized as a reference when available. We recommend appending snRNA-seq profiles only after removing cross-modality DEGs; when DEG information is limited, conditional scVI is a practical alternative.
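
To make the DEG-pruning idea concrete, the following is a minimal sketch, not the pipeline used in this study. It assumes that per-gene mean expression is available for the same cell type in both modalities, and it substitutes a simple log2 fold-change cutoff for a formal differential expression test. The function names, the threshold, the pseudocount, and the toy values are all hypothetical and chosen only for illustration.

```python
import math


def cross_modality_degs(sc_means, sn_means, lfc_threshold=1.0, pseudocount=1.0):
    """Flag genes whose mean expression differs between modalities.

    sc_means, sn_means: dict mapping gene -> mean normalized expression for
    the same cell type in scRNA-seq and snRNA-seq, respectively.
    A fold-change cutoff stands in here for a proper DEG test (assumption).
    """
    flagged = set()
    # Only genes measured in both modalities can be compared.
    for gene in sc_means.keys() & sn_means.keys():
        ratio = (sn_means[gene] + pseudocount) / (sc_means[gene] + pseudocount)
        if abs(math.log2(ratio)) >= lfc_threshold:
            flagged.add(gene)
    return flagged


def prune_reference(reference, flagged):
    """Drop flagged genes from a {cell_type: {gene: expression}} reference."""
    return {
        cell_type: {g: v for g, v in profile.items() if g not in flagged}
        for cell_type, profile in reference.items()
    }


# Toy values, invented for illustration only.
sc_means = {"gene_a": 10.0, "gene_b": 50.0, "gene_c": 30.0}
sn_means = {"gene_a": 80.0, "gene_b": 45.0, "gene_c": 28.0}

flagged = cross_modality_degs(sc_means, sn_means)
sn_reference = {"snRNA_only_cell_type": sn_means}
pruned = prune_reference(sn_reference, flagged)
```

In this toy case only `gene_a` exceeds the cutoff, so it is removed from the snRNA-seq profile before that profile would be appended to an scRNA-seq reference. A real analysis would replace the fold-change rule with a statistical DEG test on matched cell types.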