JoGo 1.0: the ACTG hierarchical nomenclature and database covering 4.7 million haplotypes across 19,194 human genes

Abstract

The Joint Open Genome and Omics Platform 1.0 (JoGo) is a global, long-read-based human haplotype database covering 19,194 MANE-standardized protein-coding genes. JoGo introduces a novel ACTG hierarchical nomenclature, with levels A (amino acid), C (coding), T (transcript), and G (gene body), that assigns numeric identifiers in descending order of global frequency. Using high-fidelity long-read sequencing, we assembled haplotype-resolved contigs for 258 globally sampled genomes, including 108 sequenced in-house. We cataloged 174,376 A-, 300,610 C-, 486,288 T-, and 3,695,204 G-level haplotypes (4,656,478 in total). Haplotype IDs are assigned once globally across all sequences, including those originating from the GRCh38 and CHM13v2 reference assemblies, embedding reference haplotypes within the same frequency-ranked space and enabling direct cross-assembly comparison. JoGo maps functional variants from ClinVar, GWAS Catalog, and GTEx onto their corresponding ACTG haplotypes and provides haplotype-expression QTL results from 1,280 HapMap RNA-seq samples across three independent studies. The web portal provides flexible search by gene name, variant ID, or ACTG code. It offers both an interactive online viewer and a privacy-preserving local viewer for secure integration with user data. JoGo enables high-resolution exploration of haplotype diversity, facilitating the identification of functional variants relevant to gene regulation, disease associations, and precision medicine. JoGo 1.0 is freely accessible at https://jogo.csml.org.
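
The frequency-ranked identifier scheme described above can be illustrated with a minimal sketch. It is not JoGo's actual implementation: the function name, the toy sequences, the 1-based numbering, and the tie-breaking rule are all assumptions made for illustration. The abstract states only that numeric identifiers are assigned in descending order of global frequency and that reference-assembly sequences share the same ID space.

```python
from collections import Counter


def assign_frequency_ranked_ids(sequences):
    """Map each distinct sequence to a numeric ID, most frequent first.

    Assumptions (not from the source): IDs start at 1, and ties in
    frequency are broken by sorting the sequences alphabetically.
    """
    counts = Counter(sequences)
    # Sort by descending count, then by sequence for a deterministic order.
    ranked = sorted(counts, key=lambda seq: (-counts[seq], seq))
    return {seq: rank for rank, seq in enumerate(ranked, start=1)}


# Hypothetical amino-acid-level observations for one gene. Sequences from
# reference assemblies would be pooled with the sampled haplotypes so that
# they receive IDs in the same ranked space.
observed = ["MKT", "MKT", "MKS", "MKT", "MRT", "MKS"]
ids = assign_frequency_ranked_ids(observed)
print(ids)
```

Under this scheme a low ID signals a globally common haplotype, so an identifier carries frequency information on its own. The same ranking could be applied independently at each of the A, C, T, and G levels.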
