ReUseData: an R/Bioconductor tool for reusable and reproducible genomic data management

ReUseData:一个用于可重用和可复现基因组数据管理的 R/Bioconductor 工具

阅读:1

Abstract

BACKGROUND: The increasing volume and complexity of genomic data pose significant challenges for effective data management and reuse. Public genomic data often undergo similar preprocessing across projects, leading to redundant or inconsistent datasets and inefficient use of computing resources. This is especially pertinent for bioinformaticians engaged in multiple projects. Tools have been created to address challenges in managing and accessing curated genomic datasets, however, the practical utility of such tools becomes especially beneficial for users who seek to work with specific types of data or are technically inclined toward a particular programming language. Currently, there exists a gap in the availability of an R-specific solution for efficient data management and versatile data reuse. RESULTS: Here we present ReUseData, an R software tool that overcomes some of the limitations of existing solutions and provides a versatile and reproducible approach to effective data management within R. ReUseData facilitates the transformation of ad hoc scripts for data preprocessing into Common Workflow Language (CWL)-based data recipes, allowing for the reproducible generation of curated data files in their generic formats. The data recipes are standardized and self-contained, enabling them to be easily portable and reproducible across various computing platforms. ReUseData also streamlines the reuse of curated data files and their integration into downstream analysis tools and workflows with different frameworks. CONCLUSIONS: ReUseData provides a reliable and reproducible approach for genomic data management within the R environment to enhance the accessibility and reusability of genomic data. The package is available at Bioconductor ( https://bioconductor.org/packages/ReUseData/ ) with additional information on the project website ( https://rcwl.org/dataRecipes/ ).

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。