STREAM-PRS: a multi-tool pipeline for streamlining polygenic risk score computation



Abstract

BACKGROUND: Polygenic risk scores (PRS) offer an elegant approach to estimating an individual's genetic predisposition to a given disease or trait. Numerous tools are available for PRS calculation, each applying different strategies to account for linkage disequilibrium and effect size shrinkage. No single tool is inherently superior, so multiple tools should be tested to identify the one that best suits the research question. Additionally, challenges such as population stratification and PRS portability further complicate the field. Here, we developed STREAM-PRS, a PRS pipeline designed to calculate scores using five popular tools: PRSice-2, PRS-CS, LDpred2, lassosum, and lassosum2.

METHODS: STREAM-PRS first computes scores under various settings in a training dataset. The selected variants are subsequently used for score calculation in the test dataset, followed by principal component (PC) correction and standardization to improve portability across different centers. Finally, the pipeline determines the best PRS tool and settings based on the variance explained (R²) in the test dataset. To demonstrate this PRS pipeline, we applied it to an in-house inflammatory bowel disease (IBD) cohort consisting of 3192 IBD cases and 822 controls. In total, 472 scores were created using the 1000 Genomes non-Finnish European subpopulation as training data and applied to UK Biobank data as the test dataset.

RESULTS: Using STREAM-PRS for 472 scores across the 5 PRS tools with 404 individuals in the training and 1000 individuals in the test dataset takes approximately 20 h to complete. For IBD, lassosum was identified as the best-performing tool with optimal settings as follows: a shrinkage value of 0.7 and a lambda value of 0.008859. Applying this optimized PRS to our in-house IBD dataset (validation) resulted in an R² of 0.203 and an AUC of 0.75. Further, the PRS showed a high positive predictive value of 0.905 but a low negative predictive value of 0.341. This suggests that the PRS is effective in identifying individuals at high risk but might be less reliable in excluding lower risk individuals.

CONCLUSIONS: Overall, STREAM-PRS provides an efficient framework for selecting the best PRS calculation strategy and helps bridge the portability gap within the PRS field. STREAM-PRS is available at https://github.com/SaraBecelaere/STREAM-PRS.
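
The post-processing steps named in the methods (PC correction, standardization, and choosing the setting with the highest variance explained) can be illustrated with a minimal Python sketch. It is not taken from the STREAM-PRS code base. The function names, the use of ordinary least squares for PC correction, the squared Pearson correlation as a stand-in for R², and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def pc_correct_and_standardize(prs, pcs):
    """Regress the raw score on the PCs (plus intercept), then z-score the residuals."""
    prs = np.asarray(prs, dtype=float)
    pcs = np.asarray(pcs, dtype=float)
    design = np.column_stack([np.ones(len(prs)), pcs])  # intercept + PCs
    coef, *_ = np.linalg.lstsq(design, prs, rcond=None)
    residuals = prs - design @ coef
    return (residuals - residuals.mean()) / residuals.std(ddof=1)


def variance_explained(score, phenotype):
    """Squared Pearson correlation; an assumed, simplified stand-in for R²."""
    r = np.corrcoef(score, phenotype)[0, 1]
    return r ** 2


def select_best(raw_scores, pcs, phenotype):
    """Return (name, R²) for the setting whose corrected score explains most variance."""
    results = {
        name: variance_explained(pc_correct_and_standardize(s, pcs), phenotype)
        for name, s in raw_scores.items()
    }
    best = max(results, key=results.get)
    return best, results[best]


# Simulated data only: no real genotypes, tools, or results are reproduced here.
rng = np.random.default_rng(0)
n = 1000
pcs = rng.normal(size=(n, 4))                 # hypothetical first four PCs
liability = rng.normal(size=n)
phenotype = (liability > 0).astype(float)     # hypothetical case/control labels
raw_scores = {
    # informative score, confounded by the first PC
    "setting_a": liability + rng.normal(size=n) + pcs[:, 0],
    # uninformative score, confounded by the first PC
    "setting_b": rng.normal(size=n) + pcs[:, 0],
}
best_name, best_r2 = select_best(raw_scores, pcs, phenotype)
```

In this sketch the residualized score is uncorrelated with the supplied PCs by construction, which is the property that makes scores more comparable across cohorts with different ancestry structure. For a binary phenotype, a pseudo-R² from logistic regression would be a common alternative to the squared correlation used here.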
