Abstract
BACKGROUND: Haplotype phasing is essential for interpreting variants associated with autosomal recessive disorders, particularly for verifying compound heterozygosity. The Genome Aggregation Database (gnomAD) recently introduced a population-based method that infers cis or trans configuration from variant co-occurrence data; however, its utility in routine clinical practice remains unclear.
RESULTS: A retrospective review was performed of 2,783 targeted gene panel tests conducted at Asan Medical Center, Korea, from 2017 to 2024. The review identified 299 unrelated patients carrying at least two reportable variants, classified as variants of uncertain significance (VUS), likely pathogenic (LP), or pathogenic (P), within the same autosomal recessive gene, yielding 342 unique variant pairs. Of these pairs, 48.5% (n = 166) were represented in the overall gnomAD dataset, but only 37.1% (n = 127) in the East Asian subset. Trans predictions were observed in 57.5% of VUS–VUS pairs (50 of 87), 87.5% of VUS–P/LP pairs (28 of 32), and 97.9% of P/LP–P/LP pairs (46 of 47), indicating a strong association between variant pathogenicity and trans prediction. Allele frequency did not differ between trans and cis predictions, whereas increasing inter-variant distance showed a weak but significant positive association with a trans prediction (R² = 0.041, p = 0.00927). Although the analysis focused on genes classically linked to autosomal recessive disease, six pairs, carried by 12 patients, were predicted to be in cis, including one recurrent pair found in seven individuals. Several cis-predicted configurations were consistent with known dominant or carrier-state phenotypes (e.g., GALC with adult-onset spastic paraparesis; GBA in Parkinson disease susceptibility), which informed clinical interpretation.
In a validation subset with confirmed phase (n = 27), gnomAD-based predictions were 96.3% concordant (26/27): all 18 trans cases were correctly identified, and one cis case was misclassified.
CONCLUSION: gnomAD-based phasing can serve as a cost-effective method for confirming compound heterozygosity, particularly when familial testing is not available, but its predictions should be interpreted in clinical context. Increasing the representation of underrepresented populations in reference datasets will further enhance the clinical utility of this approach by improving both data availability and reliability.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40246-025-00855-1.
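
The proportions reported in the RESULTS can be recomputed directly from the raw counts given in the abstract. The following is a minimal Python sketch for that check; the helper `percent` and all variable names are illustrative assumptions, not part of the study's analysis code, and the only inputs are the counts stated above.

```python
def percent(numerator, denominator, digits=1):
    """Return numerator/denominator as a percentage rounded to `digits` places."""
    return round(100 * numerator / denominator, digits)

# Counts as stated in the abstract (variable names are hypothetical).
total_pairs = 342            # unique variant pairs
pairs_in_gnomad = 166        # pairs represented in the overall gnomAD dataset
pairs_in_east_asian = 127    # pairs represented in the East Asian subset

# (pairs predicted trans, pairs represented in gnomAD) per classification.
trans_by_class = {
    "VUS-VUS": (50, 87),
    "VUS-P/LP": (28, 32),
    "P/LP-P/LP": (46, 47),
}

validated, concordant = 27, 26   # validation subset with confirmed phase

coverage_all = percent(pairs_in_gnomad, total_pairs)
coverage_eas = percent(pairs_in_east_asian, total_pairs)
trans_rates = {label: percent(n, d) for label, (n, d) in trans_by_class.items()}
concordance = percent(concordant, validated)

# The per-class denominators should add up to the pairs found in gnomAD.
classified_total = sum(d for _, d in trans_by_class.values())

print(f"gnomAD coverage (all): {coverage_all}%")
print(f"gnomAD coverage (East Asian): {coverage_eas}%")
for label, rate in trans_rates.items():
    print(f"trans predictions, {label}: {rate}%")
print(f"validation concordance: {concordance}%")
print(f"pairs with a class label: {classified_total} of {pairs_in_gnomad}")
```

Working from counts rather than from the printed percentages makes it easy to confirm that the three classification groups account for every pair represented in gnomAD.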