Population-scale Long-read Sequencing in the All of Us Research Program

Abstract

The All of Us Research Program (AoU) is a national biobank seeking to enroll one million individuals in the United States to link genomic and biomedical data, including short- and long-read whole-genome sequencing (srWGS/LRS), with rich electronic health record (EHR) information. Here, we present the first large-scale analyses of long-read sequencing (LRS) in AoU and offer a new framework for deriving genomic insights into complex structural variation (SV) of relevance to human health and disease. We performed joint analyses of 1,027 individuals self-identifying as Black or African American, sequenced to ~8x coverage with Pacific Biosciences HiFi technology and processed using cloud-native pipelines. From these LRS data we constructed a comprehensive variant callset encompassing known (FMR1 and HTT) and novel repeat expansions, clinically relevant haplotypes at loci inaccessible to srWGS, and haplotypes relevant to disease risk (HLA) and pharmacogenomics (CYP2D6), including SNVs, indels, and SVs. We developed methods for cohort-level variant calling and a scalable workflow to impute >750,000 of these SVs into existing srWGS datasets for trait association and human disease studies. Expanding to 10,000 self-identified Black or African American AoU participants with srWGS and matched EHRs, we identified 291 SV-disease associations (p < 1×10⁻⁵) spanning 226 conditions with 50.9% of associations involving SVs absent from the matched srWGS callset. Across the 226 traits, after fine-mapping using SVs and SNVs we identified 191 SV-disease pairs spanning 160 traits (70.8%) where the SV had the strongest association within the locus. Associations specific to those with computed ancestry similar to the African reference population exhibited larger effect sizes and lower allele frequencies, consistent with high-risk, ancestry-specific variants.
These results demonstrate that the integration of LRS into AoU and future biobank initiatives can provide transformative new insights into genomic variation with potentially profound impact on precision medicine.
