Uniform processing and analysis of IGVF massively parallel reporter assay data with MPRAsnakeflow

Abstract

As researchers and clinicians seek to identify human genomic alterations relevant to traits and disorders, finding and aggregating evidence that provides mechanistic support for associations between alterations and phenotypes remains challenging. The study of non-coding genomic variation is particularly difficult because accurate functional annotation of activity in a given context and across alleles is lacking. Experimental evidence is critical for prioritizing and interpreting the functional effects of genetic alterations. Massively Parallel Reporter Assays (MPRAs) have emerged as a powerful high-throughput approach, enabling quantification of regulatory element activity and allelic effects, as well as systematic dissection of gene regulatory logic and variant effects across different contexts. However, the diversity of MPRA designs, the lack of standardized formats, and the many potential processing parameters hamper data integration, reproducibility, and meta-analyses across studies. To address these challenges, the Impact of Genomic Variation on Function (IGVF) Consortium established an MPRA focus group to develop community standards, including harmonized file formats and robust analysis pipelines for a wide range of library types and experimental designs. Here, we present these formats and comprehensive computational tools, MPRAlib and MPRAsnakeflow, for uniform processing from raw sequencing reads to counts, followed by downstream analysis and visualization. Using diverse MPRA datasets, we characterize sources of technical variability, including barcode sequence bias, outlier barcodes, and delivery method (episomal vs. lentiviral). Our results establish best practices for MPRA data generation and analysis, facilitating robust, reproducible research and large-scale integration. The presented tools and standards are publicly available, providing a foundation for future collaborative efforts in regulatory genomics.
