Abstract
BACKGROUND: Comprehensive genetic characterization and screening for congenital adrenal hyperplasia (CAH) have not yet been achieved at the population level because of the complexity of the CYP21A2 locus. This prospective study incorporated long-read sequencing (LRS) into the current first-tier biochemical newborn screening (NBS) to comprehensively characterize the variant spectrum of CYP21A2, investigate the carrier frequency and expected incidence of classic and non-classic CAH (NCCAH), and evaluate the clinical feasibility of genetic NBS for CAH. METHODS: A total of 21,239 newborns were consecutively recruited from 11 centers across China between June 2023 and May 2024. All participants underwent biochemical and genetic NBS. In vitro enzymatic activity and minigene assays were performed to determine the pathogenicity of novel variants. A 30.8-kb long amplicon was generated and sequenced by LRS to determine the phasing of duplication chimeras as well as single-nucleotide variations (SNVs) and indels in CYP21A2. RESULTS: Eligible genetic screening results were obtained for 21,234 (99.98%) newborns. The allele frequencies of duplications and deletions at the CYP21A2 locus were 4.51% and 0.15%, respectively. In vitro functional analysis and LRS-based phasing were used to precisely determine carrier alleles, yielding an overall carrier allele frequency of 1.67% (711/42468, 95% confidence interval (CI): 1.55–1.80%), with 0.75% (320/42468, 95% CI: 0.67–0.84%) and 0.92% (391/42468, 95% CI: 0.83–1.01%) for classic and non-classic (NC) carrier alleles, respectively. Notably, hotspot variants, including SNVs/indels caused by microgene conversion and 30-kb deletions caused by unequal crossover, accounted for 84.0% (597/711) of all variants, whereas rare variants accounted for as much as 16.0% (114/711). The expected incidence of classic CAH according to allele frequency was 1/17613.
The expected incidence of NCCAH in the Chinese population (1/4474) was significantly lower than that in US Ashkenazi Jews (1/133) and Caucasians (1/337), mainly owing to the different allele frequencies of the NC variant CYP21A2:c.844G>T. Biochemical NBS identified 106 (0.50%) positive samples, with a positive predictive value of 0.94% (1/106). LRS accurately identified the single case of classic CAH, with no false positives. CONCLUSIONS: Our findings provide population-level carrier frequency and incidence estimates together with a comprehensive landscape of the CYP21A2 locus, and demonstrate the effectiveness of first-tier LRS-based genetic NBS for CAH. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13073-025-01594-7.
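The expected incidences and confidence intervals reported in RESULTS can be approximately reproduced from the allele counts alone. The sketch below is illustrative and rests on assumptions not stated in the abstract: Hardy-Weinberg equilibrium, classic CAH arising from two classic alleles, NCCAH arising from either two NC alleles or one NC plus one classic allele, and a normal-approximation (Wald) interval. The function names are hypothetical.

```python
from math import sqrt

# Allele counts taken from the abstract (2 alleles x 21,234 newborns).
TOTAL_ALLELES = 42468
CLASSIC_ALLELES = 320
NC_ALLELES = 391


def wald_ci(count, total=TOTAL_ALLELES, z=1.96):
    """Normal-approximation 95% CI for a proportion.

    Assumption: the abstract does not name its CI method; Wald is used
    here only because it is the simplest choice.
    """
    p = count / total
    half_width = z * sqrt(p * (1 - p) / total)
    return p - half_width, p + half_width


def expected_incidence(q_classic, q_nc):
    """Expected genotype frequencies under Hardy-Weinberg equilibrium.

    Assumptions: classic CAH requires two classic alleles; NCCAH requires
    NC/NC or NC/classic (the milder allele sets the phenotype).
    """
    classic = q_classic ** 2
    nccah = q_nc ** 2 + 2 * q_nc * q_classic
    return classic, nccah


q_c = CLASSIC_ALLELES / TOTAL_ALLELES
q_nc = NC_ALLELES / TOTAL_ALLELES
classic, nccah = expected_incidence(q_c, q_nc)
low, high = wald_ci(CLASSIC_ALLELES + NC_ALLELES)

print(f"carrier allele frequency 95% CI: {low:.2%} to {high:.2%}")
print(f"classic CAH: about 1 in {round(1 / classic)}")
print(f"NCCAH: about 1 in {round(1 / nccah)}")
```

Under these assumptions the computed values agree with the reported 1/17613 and 1/4474, which suggests, but does not confirm, that the study used a comparable model.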