CAGI4 SickKids clinical genomes challenge: A pipeline for identifying pathogenic variants

Abstract

Compared with earlier, more restricted sequencing technologies, whole-genome sequencing can in principle find all causative variants in rare disease, but data quality issues and an overwhelming level of background variants complicate the analysis. The CAGI4 SickKids clinical genome challenge provided an opportunity to assess the landscape of variants found in a difficult set of 25 unsolved rare disease cases. To address the challenge, we developed a three-stage pipeline: first carefully analyzing data quality, then classifying high-quality gene-specific variants into seven categories, and finally examining each candidate variant for compatibility with the often complex phenotypes of these patients for final prioritization. Variants consistent with the phenotypes were found in 24 of the 25 cases, and in a number of these, variants were prioritized in multiple genes. Data quality analysis suggests that some of the selected variants are likely incorrect calls, complicating interpretation. The data providers followed up on three suggested variants with Sanger sequencing, and in one case, a prioritized variant was confirmed as likely causative by the referring physician, providing a diagnosis in a previously intractable case.
