Abstract
Balanced Robertsonian translocation (ROB) is the most common chromosomal rearrangement in humans, with an estimated incidence of 1 in 800 in newborn studies. Carriers are at increased risk of cancer and are often diagnosed at fertility clinics after experiencing recurrent miscarriages, infertility, or aneuploid offspring. Genotyping carriers with DNA sequencing has been challenging because of gaps and misrepresentation of the translocation fusion site in the human reference genome. Only recently have telomere-to-telomere (T2T) human genomes revealed the sequences of the acrocentric short arms, including the most common ROB fusion site. A ROB results in the loss of two ribosomal DNA (rDNA) arrays and their adjacent distal sequences, including the highly conserved distal junction (DJ). Here, we present a novel method to type ROB carriers directly from short sequencing reads by estimating DJ copy number. We demonstrate that our method successfully genotypes ROBs using either a reference-free approach or alignments to T2T-CHM13v2 or GRCh38. Applying the method to a cohort of healthy newborns and family members (n=4,172) and to the UK Biobank (n=490,416), we find candidate ROBs at a frequency (0.11-0.12%) consistent with the previously reported incidence of 1 in 800. In addition to ROB carriers, we report the frequency of DJ loss (9 copies, 2.8-3.4%) or gain (11 or more copies, 8.4-9.3%) in the two cohorts and the 1000 Genomes Project (n=3,202), and characterize the underlying structural variation in near-T2T genome assemblies from the Human Pangenome Reference Consortium. Importantly, our method provides the first sequencing-based diagnostic for Robertsonian chromosomes and can be applied to low-coverage sequencing data, enhancing its clinical applicability and enabling new studies of structural variation on the acrocentric chromosomes.
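
The copy-number logic summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple depth-ratio estimator (mean read depth over the DJ divided by mean depth over diploid single-copy sequence, times two), and the function names and inputs are hypothetical. The baseline of 10 DJ copies follows from the five pairs of acrocentric chromosomes (13, 14, 15, 21, 22), and the loss of two copies in a ROB carrier follows from the text.

```python
import math

# Five acrocentric chromosome pairs (13, 14, 15, 21, 22), one DJ per short arm.
EXPECTED_DJ_COPIES = 10


def estimate_dj_copies(dj_depth: float, diploid_depth: float) -> float:
    """Estimate DJ copy number from read depth (assumed estimator, not from the paper).

    dj_depth: mean depth over the DJ sequence (hypothetical input).
    diploid_depth: mean depth over diploid single-copy sequence (hypothetical input).
    """
    if diploid_depth <= 0:
        raise ValueError("diploid_depth must be positive")
    # Diploid single-copy sequence is present in 2 copies, so scale the ratio by 2.
    return 2.0 * dj_depth / diploid_depth


def classify_dj(copies: float) -> str:
    """Map an estimated DJ copy number to the categories named in the text."""
    n = math.floor(copies + 0.5)  # round half up to the nearest integer
    if n == EXPECTED_DJ_COPIES - 2:
        return "candidate ROB carrier"  # two DJs lost
    if n == EXPECTED_DJ_COPIES - 1:
        return "one DJ loss"
    if n == EXPECTED_DJ_COPIES:
        return "typical"
    if n > EXPECTED_DJ_COPIES:
        return "DJ gain"
    return "other"  # fewer than 8 copies; not described in the text


# Example with made-up depths: 120x over the DJ, 30x over diploid sequence.
copies = estimate_dj_copies(120.0, 30.0)
label = classify_dj(copies)
```

A real pipeline would also need to handle coverage bias, mapping ambiguity, and noise at low coverage, none of which this sketch addresses.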