Abstract
Our understanding of rare genetic disorders (RGDs) comes largely from clinically ascertained individuals. Genomic-first ascertainment, however, can identify individuals with monogenic RGDs who were not ascertained clinically and enhance our understanding of the phenotypic spectrum, penetrance, and prevalence of RGDs. Although genomic ascertainment of RGDs at scale presents several challenges, it offers the potential for earlier and more precise diagnosis, improved management and treatment, and a more accurate description of the phenotypic spectrum of RGDs, all of which could contribute to improved outcomes for individuals with RGDs. Therefore, we curated a list of 2,701 high-confidence, single-disorder-associated genes that are not routinely screened. Next, we created a sustainable strategy for identifying disease-causing variants across this gene list in 218,680 healthcare-population participants in Geisinger's MyCode Community Health Initiative. We developed and applied automated methods for assessing the fit of participants' genomic findings to existing clinical diagnoses. Our strategy identified 2.5% of participants (N = 5,484) with a high-confidence positive molecular finding in 490 RGD-associated genes. An additional 0.7% (N = 1,455) had possible molecular findings from compound-heterozygous or novel loss-of-function variants. Of the participants with high-confidence positive molecular findings, 15.0%-21.1% had evidence of a corresponding clinical fit from existing diagnosis codes. The remainder lacked a corresponding clinical diagnosis code, suggesting that genomic ascertainment of RGDs may be more sensitive than clinical ascertainment and that penetrance for RGDs may be overestimated. This low rate of correspondence highlights the potential clinical value of a genomic-first approach to RGD ascertainment and the need for further population-based study of RGDs.