sCIN: a contrastive learning framework for single-cell multi-omics data integration

Abstract

The rapid advancement of single-cell omics technologies, such as single-cell RNA sequencing (scRNA-seq) and single-cell assay for transposase-accessible chromatin with high-throughput sequencing (scATAC-seq), has transformed our understanding of cellular heterogeneity and regulatory mechanisms. However, integrating these data types remains challenging because their distributions differ and their feature spaces are distinct. To address this, we present sCIN (single-cell Contrastive INtegration), a framework that integrates different omics modalities into a shared low-dimensional latent space. sCIN uses modality-specific encoders and contrastive learning to generate latent representations for each modality, aligning cells across modalities and removing technology-specific biases. The framework was designed to rigorously prevent data leakage between training and testing. It was extensively evaluated on three real-world paired datasets, namely simultaneous high-throughput ATAC and RNA expression with sequencing (SHARE-seq), 10X PBMC (10k version), and cellular indexing of transcriptomes and epitopes (CITE-seq), and on one unpaired dataset of gene expression and chromatin accessibility. Paired datasets are multi-omics data generated with technologies that capture different omics features from the same cell population, whereas unpaired datasets are measured from different cell populations of a tissue. Results on both paired and unpaired datasets show that sCIN outperforms state-of-the-art models, including scGLUE, scBridge, sciCAN, Con-AAE, Harmony, and MOFA+, across multiple metrics: average silhouette width for clustering quality, and Recall@k, cell type@k, cell type accuracy, and median rank for integration quality. Moreover, sCIN was evaluated on simulated unpaired datasets derived from paired data, demonstrating its ability to leverage available biological information for effective multimodal integration.
In summary, sCIN reliably integrates omics modalities while preserving biological meaning in both paired and unpaired settings.
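
To make the ideas of contrastive alignment and retrieval-style integration metrics concrete, the following is a minimal NumPy sketch and not the sCIN implementation. The abstract does not specify the loss or the exact metric definitions, so everything here is an assumption: a symmetric InfoNCE-style loss over cosine similarities, a temperature of 0.1, and Recall@k and median rank computed from the rank of each cell's true partner in the other modality. The function names are hypothetical, and the inputs `z_a` and `z_b` stand for already-computed embeddings of the same cells from two modality-specific encoders, with row i in both matrices referring to the same cell.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def symmetric_contrastive_loss(z_a, z_b, temperature=0.1):
    """InfoNCE-style loss (assumed form, not taken from the paper).

    Row i of z_a and row i of z_b are treated as a positive pair;
    every other row in the batch acts as a negative.
    """
    logits = l2_normalize(z_a) @ l2_normalize(z_b).T / temperature

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_prob = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        # The correct "class" for row i is column i.
        return -np.mean(np.diag(log_prob))

    # Average the a-to-b and b-to-a directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))


def match_ranks(z_a, z_b):
    """1-based rank of each cell's true partner among all cells in z_b."""
    sims = l2_normalize(z_a) @ l2_normalize(z_b).T
    true_sim = np.diag(sims)[:, None]
    # Count how many candidates score strictly higher than the true partner.
    return 1 + (sims > true_sim).sum(axis=1)


def recall_at_k(z_a, z_b, k):
    """Fraction of cells whose true partner is among the k most similar cells."""
    return float(np.mean(match_ranks(z_a, z_b) <= k))


def median_rank(z_a, z_b):
    """Median rank of the true partner; lower is better, 1 is ideal."""
    return float(np.median(match_ranks(z_a, z_b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z_rna = rng.normal(size=(50, 8))              # toy "RNA" embeddings
    z_atac_aligned = z_rna.copy()                 # perfectly aligned partner embeddings
    z_atac_shuffled = np.roll(z_rna, 1, axis=0)   # every pair deliberately mismatched

    print("loss (aligned): ", symmetric_contrastive_loss(z_rna, z_atac_aligned))
    print("loss (shuffled):", symmetric_contrastive_loss(z_rna, z_atac_shuffled))
    print("Recall@1 (aligned):", recall_at_k(z_rna, z_atac_aligned, 1))
    print("median rank (aligned):", median_rank(z_rna, z_atac_aligned))
```

In this toy setup the loss is lower when paired rows match than when they are shuffled, which is the property a contrastive objective relies on to pull the two modalities into a shared space. Metrics that need cell type labels, such as cell type@k and cell type accuracy, are not covered by this sketch.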
