The performance of genetic-constraint metrics varies significantly across the human noncoding genome

Abstract

A longstanding goal in human genetics is to prioritize noncoding loci that, when disrupted, lead to developmental disorders and other Mendelian traits. In pursuit of this goal, multiple metrics have been developed to distinguish neutrally evolving sequences from those subjected to purifying selection. These metrics are commonly evaluated genome-wide, e.g., by computing a precision-recall curve on windows tiling the entire noncoding genome. Here, we identify parts of the noncoding genome where these metrics significantly underperform relative to their genome-wide performance due to "bias" in the underlying models of neutral genetic variation and/or a low "signal-to-noise ratio" in the genetic data. The most extreme effects are found for Gnocchi (Chen et al. 2024), the performance of which declines as GC content increases. We suggest annotating constraint scores of noncoding genomic intervals with robust measures of the bias of the corresponding model, allowing users to gauge confidence in those scores.
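The contrast between genome-wide and stratified evaluation can be illustrated with a minimal sketch. It is not the authors' pipeline: the window records, the field names (`gc`, `score`, `constrained`), and the toy values are all hypothetical. The sketch computes average precision (a step-wise estimate of the area under the precision-recall curve) once over all windows and once per GC-content bin.

```python
from collections import defaultdict


def average_precision(scores, labels):
    """Step-wise area under the precision-recall curve.

    Windows are ranked by descending score; AP is the mean of the
    precision values observed at the rank of each positive window.
    Tied scores are not treated specially in this sketch.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        return float("nan")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / n_pos


def stratified_ap(windows, n_bins=2):
    """Average precision computed separately within equal-width GC bins."""
    bins = defaultdict(list)
    for w in windows:
        b = min(int(w["gc"] * n_bins), n_bins - 1)
        bins[b].append(w)
    return {
        b: average_precision(
            [w["score"] for w in ws], [w["constrained"] for w in ws]
        )
        for b, ws in sorted(bins.items())
    }


# Hypothetical toy windows: GC fraction, constraint score, truth label.
windows = [
    {"gc": 0.30, "score": 4.1, "constrained": 1},
    {"gc": 0.35, "score": 3.2, "constrained": 1},
    {"gc": 0.40, "score": 0.5, "constrained": 0},
    {"gc": 0.25, "score": -0.3, "constrained": 0},
    {"gc": 0.60, "score": 2.0, "constrained": 0},
    {"gc": 0.65, "score": 1.5, "constrained": 1},
    {"gc": 0.70, "score": 1.0, "constrained": 0},
    {"gc": 0.55, "score": 0.2, "constrained": 1},
]

genome_wide = average_precision(
    [w["score"] for w in windows], [w["constrained"] for w in windows]
)
per_bin = stratified_ap(windows)
print(f"genome-wide AP: {genome_wide:.3f}")
for b, ap in per_bin.items():
    print(f"GC bin {b}: AP = {ap:.3f}")
```

In this toy data the genome-wide figure (about 0.83) hides a gap between the low-GC bin (1.0) and the high-GC bin (0.5), which is the kind of discrepancy a single genome-wide curve cannot reveal.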
