Abstract
MOTIVATION: Single-cell RNA sequencing (scRNA-seq) enables biological research at single-cell resolution. Cell clustering is a crucial task in scRNA-seq data analysis because it reveals cell heterogeneity. Although existing methods have made significant progress on this task, fully exploiting the relationships among cells remains challenging. RESULTS: We propose the Interpretable Graph Contrastive Learning method with Adaptive Positive Sampling (IGCLAPS), a novel end-to-end graph contrastive clustering method for scRNA-seq data analysis. Specifically, IGCLAPS learns low-dimensional embeddings with a graph transformer, on top of which a dual-head graph contrastive learning module performs dimension reduction and cell clustering simultaneously. Because an accurate definition of positive sample pairs is crucial in contrastive learning, we also devise an adaptive positive sampling module that dynamically identifies true positive sample pairs based on both expression similarity and the soft cluster labels generated by the contrastive learning module. Extensive experiments on a series of real datasets, covering cell clustering, visualization, and differential expression analysis, demonstrate that IGCLAPS effectively enhances clustering performance and generates interpretable gene expression patterns from scRNA-seq data. AVAILABILITY AND IMPLEMENTATION: The source code of IGCLAPS is available at https://github.com/ZhengWeihuaYNU/IGCLAPS.
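
To make the idea of adaptive positive sampling concrete, the following is a minimal illustrative sketch and not the IGCLAPS implementation. It assumes that expression similarity is measured by cosine similarity, that candidate pairs are restricted to each cell's k nearest neighbours, and that two cells agree when the inner product of their soft cluster label vectors reaches a threshold. The function name `adaptive_positive_mask` and the parameters `k` and `agree_threshold` are hypothetical choices made for this example; the abstract does not specify any of them.

```python
import numpy as np


def adaptive_positive_mask(expr, soft_labels, k=1, agree_threshold=0.5):
    """Return a boolean (n_cells, n_cells) mask of candidate positive pairs.

    Assumed rule (hypothetical, for illustration only): cell j is a positive
    for cell i if j is among i's k most similar cells by cosine similarity
    AND the soft cluster labels of i and j agree strongly enough.
    """
    expr = np.asarray(expr, dtype=float)
    soft_labels = np.asarray(soft_labels, dtype=float)
    n_cells = expr.shape[0]

    # Cosine similarity between expression profiles.
    norms = np.linalg.norm(expr, axis=1, keepdims=True)
    unit = expr / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a cell is never its own neighbour

    # Mark the k nearest neighbours of every cell.
    nn_idx = np.argsort(-sim, axis=1)[:, :k]
    knn = np.zeros((n_cells, n_cells), dtype=bool)
    knn[np.repeat(np.arange(n_cells), k), nn_idx.ravel()] = True

    # Agreement of soft cluster labels: inner product of probability vectors.
    agreement = soft_labels @ soft_labels.T

    return knn & (agreement >= agree_threshold)


# Toy data: 4 cells x 3 genes, with soft assignments to 2 clusters.
expr = np.array([[5, 0, 1], [4, 0, 1], [0, 6, 1], [0, 5, 2]])
soft_labels = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.6, 0.4]])

mask = adaptive_positive_mask(expr, soft_labels, k=1, agree_threshold=0.5)
print(mask.astype(int))
```

In this toy example, cells 0 and 1 are mutual nearest neighbours with consistent soft labels, so they are kept as a positive pair. Cells 2 and 3 have similar expression but conflicting soft labels, so the pair is rejected. Because the soft labels change during training, such a mask would be recomputed as the model updates, which is the sense in which the sampling is adaptive.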