Function inference of million-scale microbiomes using multi-GPU acceleration



Abstract

Amplicon sequencing enables taxonomic profiling of microbial communities but offers limited insight into their functional potential. Existing tools such as Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt2) infer functions through phylogenetic placement and ancestral state reconstruction; however, these methods are computationally intensive and inefficient for large-scale data sets. To address these challenges, we introduce microbiome graphics processing unit (GPU)-based function inference (MGFunc), an ultra-high-throughput framework for microbiome functional inference that leverages multi-GPU acceleration. MGFunc transforms functional prediction for amplicons into standardized matrix multiplication using a pre-constructed genomic content network. It further integrates split data loading, matrix partitioning, and dynamic scheduling across multiple GPUs, enabling scalable, batch-wise analysis of millions of samples under limited GPU memory and system random access memory (RAM). Compared to PICRUSt2, MGFunc achieves speedups of up to several hundred thousand times, completing the functional interpretation of one million samples within one minute using four GPUs on a single server. This work provides a highly efficient, low-latency solution for functional inference on ultra-large microbiome data sets, paving the way for global-scale microbiome studies. The MGFunc software is freely accessible at https://github.com/qdu-bioinfo/MGFunc.

Importance

Understanding what microbes do (their functions) is essential for studying health, disease, agriculture, and the environment. While cost-effective sequencing methods like 16S rRNA gene analysis are widely used, they do not directly reveal microbial functions. Existing tools that predict these functions from 16S data are often too slow for today's large studies involving hundreds of thousands of samples. In this work, we developed microbiome graphics processing unit (GPU)-based function inference (MGFunc), a new method that predicts microbial functions quickly and accurately by using GPUs and a streamlined mathematical approach. MGFunc can analyze over one million samples in under a minute, making it one of the fastest tools available. This enables researchers to study the functional potential of microbial communities on a truly global and population scale.
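
The idea of casting functional prediction as matrix multiplication, processed batch by batch, can be illustrated with a minimal CPU-only sketch. This is not the MGFunc implementation: the abstract does not describe the layout of the genomic content network, so the sketch assumes it can be flattened into a dense taxa-by-function table, and all names (`matmul`, `row_batches`, `infer_functions`) and the toy values are hypothetical. Each batch of samples is multiplied independently, which is what would allow batches to be dispatched to different GPUs under a fixed memory budget.

```python
from typing import Iterator, List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense product of a (n x k) and b (k x m)."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def row_batches(a: Matrix, batch_size: int) -> Iterator[Matrix]:
    """Yield consecutive row blocks of a, each with at most batch_size rows."""
    for start in range(0, len(a), batch_size):
        yield a[start:start + batch_size]


def infer_functions(abundance: Matrix, gene_content: Matrix, batch_size: int) -> Matrix:
    """Compute abundance @ gene_content one sample batch at a time.

    abundance:    samples x taxa table (assumed layout)
    gene_content: taxa x functions table (assumed layout)
    """
    result: Matrix = []
    for batch in row_batches(abundance, batch_size):
        # Batches are independent, so each could go to a separate device.
        result.extend(matmul(batch, gene_content))
    return result


# Toy data (hypothetical): 3 samples x 2 taxa, and 2 taxa x 3 functions.
abundance = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
gene_content = [[2.0, 0.0, 1.0], [0.0, 4.0, 1.0]]

profiles = infer_functions(abundance, gene_content, batch_size=2)
```

Because the product is computed row block by row block, the result does not depend on `batch_size`; only peak memory use does, which is the property that batch-wise analysis under limited memory relies on.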
