Abstract
Parkinson's disease (PD) is a complex neurodegenerative disorder with environmental and genetic influences. Using genotyping array data from 661 South African PD cases and 737 controls, we conduct a polygenic risk score (PRS) analysis with PRSice-2. Summary statistics from two PD association studies serve as the base datasets. We split the target dataset into training (70%; n = 979) and validation (30%; n = 419) cohorts. We test various clumping window sizes, linkage disequilibrium thresholds, and p-value thresholds to determine the optimal combination for risk prediction. Additionally, we investigate the variance explained by different combinations of covariates. Overall, we observe modest predictive performance (area under the curve, AUC: 0.5847-0.6183). Age at recruitment emerges as the strongest individual predictor, while sex contributes the least. These findings provide the first evaluation of PRS performance for PD in a highly admixed South African cohort, underscoring the importance of including underrepresented populations in genetic risk prediction. By systematically assessing predictive performance across two base datasets, we highlight how ancestry composition and study design affect risk estimation in diverse populations. This work lays a foundation for refining genomic prediction in admixed populations and contributes to ongoing efforts to ensure that advances in precision medicine are globally relevant.
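As a rough illustration of the design summarized above, the sketch below reproduces the 70/30 split sizes from the reported cohort counts and enumerates a small grid of clumping windows, linkage disequilibrium thresholds, and p-value thresholds. It is a minimal sketch under stated assumptions, not the study's pipeline: the plain random (unstratified) split, the seed, the function names, and every grid value are hypothetical, since the abstract does not state how samples were assigned or which settings were tested.

```python
import itertools
import random

# Cohort sizes reported in the abstract.
N_CASES, N_CONTROLS = 661, 737
TRAIN_FRACTION = 0.70

# Hypothetical grid values, chosen only for illustration; the abstract
# does not list the settings that were actually tested.
CLUMP_KB = [250, 500, 1000]                        # clumping window sizes (kb)
CLUMP_R2 = [0.1, 0.2, 0.5]                         # LD (r^2) thresholds
P_THRESHOLDS = [5e-8, 1e-5, 1e-3, 0.05, 0.5, 1.0]  # p-value thresholds


def split_indices(n_samples, train_fraction, seed=0):
    """Shuffle sample indices and split them into training and validation sets.

    Assumption: a plain random split; the study may have stratified by
    case/control status or other variables.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def parameter_grid():
    """Return every (window, r^2, p-value threshold) combination in the grid."""
    return list(itertools.product(CLUMP_KB, CLUMP_R2, P_THRESHOLDS))


train_idx, valid_idx = split_indices(N_CASES + N_CONTROLS, TRAIN_FRACTION)
grid = parameter_grid()
print(len(train_idx), len(valid_idx), len(grid))
```

In a workflow of this kind, each combination would typically be scored on the training cohort, and only the best-performing setting would then be evaluated on the held-out validation cohort.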