Abstract
India is the world's most populous country, yet genetic studies involving Indian individuals remain limited. The Indian population comprises many founder groups and has mixed genetic ancestry, including an ancestral component not observed outside India. This presents a unique opportunity to uncover novel disease variants and develop tailored medical interventions. A crucial first step toward facilitating genetic research in India is to create a foundational resource that serves as a benchmark for future population studies and methods development. We therefore constructed the largest and most nationally representative linkage disequilibrium (LD) and genotype imputation reference panels in India to date, using high-coverage whole-genome sequencing data from 2,680 participants in the Longitudinal Aging Study in India-Harmonized Diagnostic Assessment of Dementia (LASI-DAD). As an LD reference panel, LASI-DAD includes 69.5 million variants, representing increases of 170% and 213% relative to the 1000 Genomes Project and TOP-LD South Asian panels, respectively. Besides serving as an LD lookup panel, LASI-DAD facilitates statistical analyses that rely on precise LD estimates. In polygenic risk score (PRS) analyses, LASI-DAD improved PRS predictive performance by 2.1%-35.1% across traits and studies. As an imputation reference panel, LASI-DAD improved imputation accuracy, measured by the Pearson correlation between imputed and true genotypes, by 3%-101% (mean 38%) compared with the TOPMed panel and by 3%-73% (mean 27%) compared with the Genome Asia Pilot panel across different allele frequencies. The LASI-DAD reference panel is publicly available to benefit future studies.
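
For readers unfamiliar with the accuracy metric named above, the following is a minimal sketch of how a per-variant Pearson correlation between imputed and true genotypes might be computed, and how two panels could be compared. It is illustrative only and is not the pipeline used in this study: the function names `imputation_r` and `relative_improvement` are hypothetical, genotypes are assumed to be coded as alternate-allele counts (0, 1, 2) with imputed values given as dosages, and the reported percentage gains are assumed here to be relative changes in correlation.

```python
import numpy as np


def imputation_r(true_genotypes, imputed_dosages):
    """Pearson correlation between true genotypes and imputed dosages
    for a single variant (assumed coding: alternate-allele count 0/1/2)."""
    g = np.asarray(true_genotypes, dtype=float)
    d = np.asarray(imputed_dosages, dtype=float)
    if g.shape != d.shape:
        raise ValueError("true and imputed vectors must have the same shape")
    # The correlation is undefined when either vector has no variance,
    # e.g. a variant that is monomorphic in the evaluation samples.
    if g.std() == 0 or d.std() == 0:
        return float("nan")
    return float(np.corrcoef(g, d)[0, 1])


def relative_improvement(r_new, r_baseline):
    """Percentage change of r_new over r_baseline (assumed definition)."""
    return 100.0 * (r_new - r_baseline) / r_baseline


if __name__ == "__main__":
    # Toy data for one variant in eight samples; values are made up.
    truth = [0, 1, 2, 0, 1, 0, 0, 2]
    panel_a = [0.1, 0.9, 1.8, 0.0, 1.2, 0.1, 0.0, 1.9]  # hypothetical panel
    panel_b = [0.4, 0.6, 1.1, 0.3, 0.9, 0.5, 0.2, 1.4]  # hypothetical panel
    r_a = imputation_r(truth, panel_a)
    r_b = imputation_r(truth, panel_b)
    print(f"r (panel A) = {r_a:.3f}, r (panel B) = {r_b:.3f}")
    print(f"relative improvement of A over B = {relative_improvement(r_a, r_b):.1f}%")
```

In practice such correlations are typically averaged within allele frequency bins before panels are compared, since rare variants tend to be imputed less accurately than common ones.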