Abstract
Doublets in single-cell sequencing data, which arise when two or more cells are captured within a single reaction volume, introduce biases that compromise downstream analysis. Existing doublet detection methods focus primarily on single-modality data and show limited robustness across datasets. To overcome these limitations, we developed OmniDoublet, a multimodal doublet detection method that integrates transcriptomic and epigenomic data. OmniDoublet uses the Jaccard similarity coefficient to compute weights that assess the reliability of neighboring cells across modalities, and uses these weights to combine the doublet scores from the different modalities into a final integrated score. It then employs a Gaussian mixture model (GMM) to establish thresholds on the integrated score, classifying each cell as a singlet or a doublet. Benchmarking against state-of-the-art methods across various datasets demonstrates that OmniDoublet achieves superior accuracy, robustness, and scalability, offering a dependable framework for detecting doublets across diverse scenarios. By harnessing the comprehensive information in multimodal single-cell data, OmniDoublet improves doublet detection and enables researchers to gain more accurate and reliable insights into cellular processes.
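The two steps named above, Jaccard-based weighting of per-modality scores and GMM-based thresholding, can be illustrated with a minimal sketch. This is not the OmniDoublet implementation: the abstract does not give the weighting formula, so the rule in `integrate_scores` (average the two scores when a cell's neighbor sets agree across modalities, lean toward the larger score when they disagree) is a hypothetical stand-in. All function names, the `rna`/`atac` labels, and the 0.5 posterior cutoff are likewise assumptions made for illustration. The mixture is a plain two-component one-dimensional EM fit written with the standard library only.

```python
import math


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two neighbor-index collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def integrate_scores(rna_scores, atac_scores, rna_neighbors, atac_neighbors):
    """Combine per-modality doublet scores into one score per cell.

    ASSUMPTION: this weighting rule is hypothetical, not the published one.
    w is the Jaccard overlap of a cell's neighbor sets in the two modalities.
    """
    integrated = []
    for s_rna, s_atac, n_rna, n_atac in zip(
        rna_scores, atac_scores, rna_neighbors, atac_neighbors
    ):
        w = jaccard(n_rna, n_atac)
        mean_score = 0.5 * (s_rna + s_atac)
        integrated.append(w * mean_score + (1.0 - w) * max(s_rna, s_atac))
    return integrated


def _normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def _posteriors(v, weights, means, variances):
    """Posterior probability of each of the two components for value v."""
    p = [weights[k] * _normal_pdf(v, means[k], variances[k]) for k in range(2)]
    total = sum(p)
    return [pk / total for pk in p] if total > 0 else [0.5, 0.5]


def fit_gmm_1d(x, n_iter=200, var_floor=1e-6):
    """Fit a two-component 1D Gaussian mixture by EM.

    Means are initialized at min(x) and max(x); variances are floored
    to avoid collapse onto a single point.
    """
    n = len(x)
    grand_mean = sum(x) / n
    var0 = max(sum((v - grand_mean) ** 2 for v in x) / n, var_floor)
    weights, means, variances = [0.5, 0.5], [min(x), max(x)], [var0, var0]
    for _ in range(n_iter):
        # E-step: responsibilities under the current parameters.
        resp = [_posteriors(v, weights, means, variances) for v in x]
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, x)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, x)) / nk
            variances[k] = max(var_k, var_floor)
    return weights, means, variances


def call_doublets(scores, cutoff=0.5):
    """Label a cell a doublet if its posterior for the higher-mean component exceeds cutoff."""
    weights, means, variances = fit_gmm_1d(scores)
    high = 0 if means[0] > means[1] else 1
    return [_posteriors(v, weights, means, variances)[high] > cutoff for v in scores]
```

In this sketch, `integrate_scores` would be called with one score and one neighbor list per cell for each modality, and `call_doublets` would then be applied to the returned integrated scores. A production pipeline would more likely rely on an established mixture-model implementation than on this hand-written EM loop.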