Haplotype-resolved genome assemblies of BJ and IMR-90 human fibroblast cell lines reveal extensive structural variation and enable reanalysis of historical sequencing data

Abstract

We present chromosome-level, phased diploid genome assemblies of two widely used human fibroblast cell lines: BJ (46,XY) and IMR-90 (46,XX). Using Oxford Nanopore, PacBio HiFi, and Hi-C sequencing data, we generated assemblies spanning 5.9 and 6.0 Gbp with diploid quality values exceeding QV 60. To validate structural integrity, we developed KaryoScope, an alignment-free tool for generating computational karyograms from k-mer feature databases. We identify >50 000 structural variants relative to T2T-CHM13v2.0, the majority of which are heterozygous and cell-line-specific. Combining reference-based and de novo gene annotation, we uncover a previously unreported 1 Mbp homozygous duplication at the 16p11.2 locus in BJ, demonstrating that even karyotypically normal cell lines can harbor clinically relevant submicroscopic rearrangements. We show that mapping publicly available short-read, RNA-seq, and ChIP-seq data to sample-matched diploid assemblies substantially improves read alignment and enables haplotype phasing of 23%-28% of short reads. The BJ and IMR-90 assemblies and associated variant calls are publicly available as a resource for the research community.
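
For readers unfamiliar with the quality metric, the sketch below converts a quality value to a per-base error rate. It assumes the reported QV is the standard Phred-scaled consensus quality, QV = -10 * log10(error rate), which the abstract does not state explicitly. The function names are illustrative, and the assembly sizes are the rounded spans quoted above.

```python
def qv_to_error_rate(qv: float) -> float:
    """Per-base error probability for a Phred-scaled quality value (assumed convention)."""
    return 10 ** (-qv / 10)


def expected_errors(qv: float, assembly_bp: float) -> float:
    """Expected number of erroneous bases in an assembly of the given size."""
    return qv_to_error_rate(qv) * assembly_bp


if __name__ == "__main__":
    # Rounded diploid assembly spans quoted in the abstract (hypothetical labels).
    for name, size_bp in [("BJ", 5.9e9), ("IMR-90", 6.0e9)]:
        errors = expected_errors(60, size_bp)
        print(f"{name}: at QV 60, about {round(errors)} expected base errors")
```

Under this assumption, QV 60 corresponds to roughly one error per million bases, so "exceeding QV 60" would imply fewer than about 6,000 erroneous bases across each diploid assembly.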
