Abstract
Ribosomal DNA (rDNA) encodes the precursor transcripts that are processed into ribosomal RNAs (rRNAs), the structural and catalytic components of the ribosome, and is therefore indispensable for protein synthesis and cell viability. The transcribed region of the human rDNA locus is exceptionally GC-rich, a feature that promotes the formation of non-canonical DNA structures (NCS) such as R-loops, G-quadruplexes (G4s), and i-motifs (iMs). Although previous studies have reported NCS in specific regions of human rDNA, there is no comprehensive map of their distribution across the entire human rDNA sequence. Here, we use validated computational tools to systematically identify predicted NCS sequences (PNCSS) across the human rDNA locus. Our analyses reveal that R-loop-, G4-, and iM-forming sequences are non-randomly distributed in the rDNA. These PNCSS are enriched in non-coding spacer regions, including the 5' external transcribed spacer (5'ETS), the internal transcribed spacers (ITS1 and ITS2), and the 3'ETS. PNCSS are also enriched in specific subdomains of the 28S coding region but are strikingly depleted from the 18S region. These motifs exhibit strong strand asymmetry, frequent co-localization, and evolutionarily conserved enrichment across vertebrate species. Notably, PNCSS enrichment is inversely correlated with RNA polymerase I (Pol I) occupancy, suggesting that these structures may impede transcription and serve regulatory or quality-control functions. Together, our findings define a coherent and conserved architecture of non-canonical structures within the human rDNA locus. These PNCSS mark genomic hotspots for structural elements that may regulate rDNA biology and are potential targets for therapeutic intervention.