African Pan Genome Contigs Expose Biologically Relevant Sequence Still Hidden from Human Reference Frameworks

Abstract

Human reference genomes underpin biomedical discovery but remain incomplete and biased toward European populations, constraining interpretation of genetic variation in underrepresented populations. Here we characterize African Pan Genome (APG) contigs totaling 296.5 Mb to define the sequence and functional landscape of genomic regions absent from current references. Most contigs align to the telomere-to-telomere (T2T-CHM13) genome and across 47 haplotype-resolved Human Pangenome Reference Consortium (HPRC) assemblies, with T2T-CHM13 placements enriched in centromeric and satellite repeats and overlapping 373 genes, including disease-associated loci. Mapping across HPRC assemblies revealed ancestry-associated contig enrichment, particularly in African genomes. Notably, 742 contigs remained unmapped under both stringent and relaxed criteria. These sequences are largely nonrepetitive and exhibit strong functional potential, including predicted protein-coding genes, CpG islands and transcriptional activity. Together, these results demonstrate that functionally relevant, ancestry-enriched genomic sequences remain absent from current references, with important implications for disease variant interpretation and precision medicine.
