Long-read sequencing resolves the clinically relevant CYP21A2 locus, supporting a new clinical test for Congenital Adrenal Hyperplasia



Abstract

Congenital Adrenal Hyperplasia (CAH), one of the most common inherited disorders, is caused by defects in adrenal steroidogenesis. It is potentially lethal if untreated and is associated with multiple comorbidities, including fertility issues, obesity, insulin resistance, and dyslipidemia. CAH can result from variants in multiple genes, but the most frequent cause is deletions and conversions in the segmentally duplicated RCCX module, which contains the CYP21A2 gene and a pseudogene. The molecular genetic test to identify pathogenic alleles is cumbersome, incomplete, and available from only a limited number of laboratories. It requires testing parents for accurate interpretation, leading to healthcare inequity. Less severe forms are frequently misdiagnosed, and phenotype/genotype correlations are incompletely understood. We explored whether emerging technologies could be leveraged to identify all pathogenic alleles of CAH, including phasing in proband-only cases. We targeted long-read sequencing outputs that would be practical in a clinical laboratory setting. Both HiFi-based and nanopore-based whole-genome long-read sequencing datasets could be mined to accurately identify pathogenic single-nucleotide variants, full gene deletions, and fusions creating non-functional hybrids between the gene and pseudogene ("30-kb deletion"), as well as to count the number of RCCX modules and phase the resulting multimodular haplotypes. On the HiFi dataset of 6 samples, the PacBio Paraphase tool was able to distinguish nine different mono-, bi-, and tri-modular haplotypes, as well as the 30-kb and whole-gene deletions. To do the same on the ONT nanopore dataset, we designed a tool, Parakit, which creates an enriched local pangenome to represent known haplotype assemblies and map ClinVar pathogenic variants and fusions onto them.
With few labels in the region, optical genome mapping was not able to reliably resolve module counts or fusions, although designing a tool to mine the dataset specifically for this region may allow doing so in the future. Both sequencing techniques yielded congruent results, matching clinically identified variants, and offered additional information beyond the clinical test, including phasing, count of RCCX modules, and status of the other module genes, all of which may be of clinical relevance. Thus, long-read sequencing could be used to identify variants causing multiple forms of CAH in a single test.
