ScaleSC: a superfast and scalable single-cell RNA-seq data analysis pipeline powered by GPU



Abstract

SUMMARY: Large-scale single-cell RNA-seq datasets are increasingly common, but processing them is slow. Building on the Scanpy and Rapids-singlecell packages and leveraging advances in the Graphics Processing Unit (GPU) computing ecosystem, such as CuPy and Compute Unified Device Architecture (CUDA), we developed ScaleSC, a GPU-accelerated solution for large-scale single-cell data processing. ScaleSC delivers a more than 20× speedup through GPU computing and significantly improves scalability: by overcoming the memory bottleneck, it handles datasets of 10-20 million cells with over 1000 batches on a single A100 card, far surpassing Rapids-singlecell, which can process only 1 million cells without multi-GPU support. We also resolved discrepancies between the GPU and Central Processing Unit (CPU) implementations of the algorithms so that both give consistent results. Beyond these core optimizations, we developed novel tools for marker gene identification and cluster merging, with GPU-optimized implementations that are seamlessly integrated into the pipeline. ScaleSC's syntax is similar to Scanpy's, which lowers the learning curve for users already familiar with Scanpy workflows.

AVAILABILITY AND IMPLEMENTATION: The ScaleSC package (https://github.com/interactivereport/ScaleSC) promises significant benefits for the single-cell RNA-seq computational community.
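The abstract does not say how ScaleSC overcomes the memory bottleneck. One common approach for data that do not fit in device memory, shown here purely as an assumption and not as ScaleSC's actual method, is to stream the cell-by-gene matrix in chunks and merge per-chunk statistics. The minimal sketch below computes per-gene mean and variance this way in plain Python; the function name `chunked_gene_stats` is hypothetical, and in a real GPU pipeline each chunk would more plausibly be an array moved to the device one at a time.

```python
from typing import Iterable, List, Sequence, Tuple

# A chunk is a block of cells: rows are cells, columns are genes.
Chunk = Sequence[Sequence[float]]


def chunked_gene_stats(chunks: Iterable[Chunk]) -> Tuple[List[float], List[float]]:
    """Per-gene mean and population variance, holding one chunk at a time.

    Hypothetical helper for illustration only; not part of ScaleSC.
    """
    n_total = 0
    mean: List[float] = []
    m2: List[float] = []  # running sum of squared deviations per gene

    for chunk in chunks:
        n_chunk = len(chunk)
        if n_chunk == 0:
            continue
        n_genes = len(chunk[0])
        if not mean:
            mean = [0.0] * n_genes
            m2 = [0.0] * n_genes

        # Statistics of this chunk alone.
        c_mean = [sum(row[g] for row in chunk) / n_chunk for g in range(n_genes)]
        c_m2 = [
            sum((row[g] - c_mean[g]) ** 2 for row in chunk) for g in range(n_genes)
        ]

        # Merge the chunk into the running totals (pairwise update of
        # count, mean, and sum of squared deviations).
        n_new = n_total + n_chunk
        for g in range(n_genes):
            delta = c_mean[g] - mean[g]
            mean[g] += delta * n_chunk / n_new
            m2[g] += c_m2[g] + delta * delta * n_total * n_chunk / n_new
        n_total = n_new

    if n_total == 0:
        return [], []
    return mean, [v / n_total for v in m2]
```

Because only the running count, mean, and sum of squared deviations persist between chunks, peak memory depends on the chunk size rather than on the total number of cells, and the result matches a single pass over the full matrix up to floating-point rounding.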
