Abstract
Rare genetic variation is considered a potential source of heritability in sporadic Alzheimer disease and related dementias (ADRD). The Variant-set test for association using annotation information (STAAR) framework leverages multiple functional annotations of genetic variants and combines association statistics from several variant aggregation-based methods, including the burden test, the sequence kernel association test (SKAT), and the aggregated Cauchy association test (ACAT-V), into a single measure of significance. Using whole-genome sequencing data from the Alzheimer's Disease Sequencing Project (ADSP), we applied the STAAR framework to comprehensively examine the association of rare genetic variation with ADRD in 23,454 individuals (37% affected by ADRD) and with cognitively healthy elder status in 13,292 individuals (13% cognitively healthy elders) from diverse populations. We identified several genes significantly associated with ADRD or with cognitively healthy status. However, our analysis also revealed several limitations of the STAAR framework when ultra-rare variants are analyzed with dichotomous outcomes. To make the framework more robust, we proposed several computational refinements, including collapsing ultra-rare variants into a single burden and using more precise annotations that match the expected mechanism. After we implemented these modifications, the association of ZNF200 with ADRD was no longer statistically significant (α = 1 × 10⁻⁷), whereas TBX19, PLXNB2, CARD11, and LINC01880 remained significantly associated with cognitively healthy status. We thus identified and addressed computational limitations in the STAAR framework that can produce spurious results for ultra-rare variant aggregates with an extremely low cumulative minor-allele count. Our proposed refinements yielded more robust rare-variant association results for dichotomous outcomes.
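As a rough illustration of two ideas mentioned above, the following Python sketch shows (1) a Cauchy combination of p-values, the general technique behind ACAT-style tests for merging several test results into a single measure of significance, and (2) one possible way to collapse ultra-rare variants into a single burden. It is not the STAAR implementation: the function names, the equal default weights, and the minor-allele-count threshold `mac_threshold` are assumptions made for illustration only.

```python
import math


def cauchy_combine(pvalues, weights=None):
    """Combine p-values with the Cauchy combination rule.

    Each p-value is mapped to a standard Cauchy variate via
    tan((0.5 - p) * pi); the weighted mean of these variates is
    mapped back to a p-value with the Cauchy survival function.
    Equal weights are an assumption of this sketch.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p < 1.0) for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(pvalues)
    total = sum(weights)
    stat = sum(
        w * math.tan((0.5 - p) * math.pi) for p, w in zip(pvalues, weights)
    ) / total
    return 0.5 - math.atan(stat) / math.pi


def collapse_ultra_rare(genotypes, mac_threshold=10):
    """Collapse ultra-rare variants into one burden pseudo-variant.

    genotypes: list of variants, each a list of minor-allele counts
    per individual. Variants whose cumulative minor-allele count is
    below mac_threshold (a hypothetical cutoff) are summed per
    individual; all other variants are returned unchanged.
    """
    n_individuals = len(genotypes[0])
    kept = []
    burden = [0] * n_individuals
    for variant in genotypes:
        if sum(variant) < mac_threshold:
            burden = [b + g for b, g in zip(burden, variant)]
        else:
            kept.append(variant)
    return kept, burden
```

Collapsing first means that an aggregate with a very low cumulative minor-allele count is evaluated once as a burden, rather than each ultra-rare variant contributing its own unstable test result to the combined p-value.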