Detection and annotation of unique regions in mammalian genomes

Abstract

Long unique genomic regions have been reported to be highly enriched for developmental genes in mice and humans. In this paper, we identify unique genomic regions using an efficient method based on fast string matching. We quantify the resource consumption and accuracy of this method before applying it to the genomes of 18 mammals. We annotate their unique regions (URs) of at least 10 kb and find that they are strongly enriched for developmental genes in all 18 genomes. We then investigate the subset of URs that lack annotations, which we call "anonymous." The longest anonymous UR in the Tasmanian devil spans 83 kb and contains the gene encoding inositol polyphosphate-5-phosphatase A, which is an essential part of intracellular signaling. This discovery of an essential gene in a UR suggests that URs might be given priority when annotating mammalian genomes. Our documented pipeline for annotating URs in any mammalian genome is available from the repository github.com/evolbioinf/auger; the additional data for this study are available from the dataverse at doi.org/10.17617/3.4IKQAG.
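
To make the idea of a unique region concrete, the toy sketch below marks stretches of a sequence that are covered only by k-mers occurring exactly once, then keeps stretches above a minimum length (the analogue of the 10 kb threshold). This is an illustration under simplifying assumptions, not the method implemented in the auger pipeline: the function name `unique_regions`, the parameters `k` and `min_len`, and the k-mer counting approach are all hypothetical choices made here, and reverse complements are ignored.

```python
from collections import Counter


def unique_regions(seq, k=4, min_len=6):
    """Return half-open (start, end) intervals covered only by k-mers
    that occur exactly once in seq (forward strand only)."""
    n = len(seq)
    if n < k:
        return []
    # Count every k-mer in the sequence.
    counts = Counter(seq[i:i + k] for i in range(n - k + 1))
    regions = []
    start = None  # start of the current run of unique k-mers, if any
    for i in range(n - k + 1):
        if counts[seq[i:i + k]] == 1:
            if start is None:
                start = i
        elif start is not None:
            # The last unique k-mer started at i - 1 and ends before i - 1 + k.
            end = i - 1 + k
            if end - start >= min_len:
                regions.append((start, end))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and n - start >= min_len:
        regions.append((start, n))
    return regions


if __name__ == "__main__":
    # Assumed toy input: a unique core flanked by a repeated poly-A stretch.
    print(unique_regions("AAAACGTAAAA", k=3, min_len=3))
```

Counting k-mers in a hash table takes memory proportional to the number of distinct k-mers, so a sketch like this is only practical for short sequences; whole mammalian genomes call for index-based string matching of the kind the abstract refers to.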
