pandasPGS: a Python package for easy retrieval of Polygenic Score Catalog data



Abstract

BACKGROUND: The Polygenic Score (PGS) Catalog is a public database dedicated to storing polygenic risk scores. To date, the database includes 5,022 polygenic risk scores associated with 656 different traits. Although the PGS Catalog offers an official representational state transfer (REST) application programming interface (API), there is no ready-made data client tailored to any specific programming language. To integrate PGS data into their analytical workflows, researchers must therefore spend time learning the structure of the REST API and implementing a corresponding client in their programming language of choice. METHODS: In this work we introduce pandasPGS, a Python package that provides programmatic access to PGS Catalog data. When a researcher calls one of its functions, pandasPGS automatically selects the appropriate uniform resource locator (URL) based on the function name and parameters, requests the data, and merges the paginated results. pandasPGS also provides data pre-processing functions: following the structure of the retrieved data, it can convert the data into several hierarchical pandas.DataFrame objects, which is convenient for further analysis. RESULTS: This tool allows researchers to easily analyze PGS Catalog data using Python. It reduces the time researchers need to spend learning the REST API of the PGS Catalog. The source code is available at https://github.com/tianzelab/pandaspgs, and the API documentation is available at https://tianzelab.github.io/pandaspgs/.
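
The two steps described in METHODS, merging paginated results and converting nested records into hierarchical pandas.DataFrame objects, can be illustrated with a minimal sketch. This is not the pandasPGS implementation or its API. The function names (`fetch_page`, `fetch_all`, `to_frames`), the in-memory `FAKE_PAGES` stand-in for the REST endpoint, the record fields, and the assumption that each page carries a `results` list and a `next` link are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical in-memory stand-in for a paginated REST endpoint.
# Each page is assumed to hold a "results" list and a "next" link
# (None on the last page). The records are made up for illustration.
FAKE_PAGES = {
    "page1": {
        "next": "page2",
        "results": [
            {"id": "PGS_A", "name": "score_a",
             "traits": [{"id": "T1", "label": "trait one"}]},
        ],
    },
    "page2": {
        "next": None,
        "results": [
            {"id": "PGS_B", "name": "score_b",
             "traits": [{"id": "T1", "label": "trait one"},
                        {"id": "T2", "label": "trait two"}]},
        ],
    },
}


def fetch_page(url):
    """Return one page; a real client would issue an HTTP GET here."""
    return FAKE_PAGES[url]


def fetch_all(first_url, fetch=fetch_page):
    """Follow the "next" links and merge every page into one list."""
    records = []
    url = first_url
    while url is not None:
        page = fetch(url)
        records.extend(page["results"])
        url = page["next"]
    return records


def to_frames(records):
    """Split nested records into a parent table and a child table."""
    # Parent table: one row per score, without the nested "traits" list.
    scores = pd.DataFrame(
        [{k: v for k, v in r.items() if k != "traits"} for r in records]
    )
    # Child table: one row per (score, trait) pair, keyed by the score id.
    traits = pd.json_normalize(
        records,
        record_path="traits",
        meta=["id"],
        record_prefix="trait_",
        meta_prefix="score_",
    )
    return scores, traits


records = fetch_all("page1")
scores, traits = to_frames(records)
```

Here `scores` holds one row per score and `traits` holds one row per score-trait pair, so the two tables can be joined on `scores["id"]` and `traits["score_id"]`. Keeping the nested list in its own table avoids storing list-valued cells in the parent DataFrame.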
