Dataset of the frequency patterns of publications annotated to human protein-coding genes, their protein products and genetic relevance

Abstract

We present data on the distribution of scientific publications across human protein-coding genes, together with their protein products and genetic relevance. We annotated the gene2pubmed dataset (Maglott et al., 2007) provided by the National Center for Biotechnology Information (NCBI) with publication years, genetic metadata from the corresponding Online Mendelian Inheritance in Man (OMIM) entries (Hamosh et al., 2005), and the frequency with which each gene appears in Genome-Wide Association Studies (GWAS) (Buniello et al., 2019) as provided by the European Bioinformatics Institute (EBI). The annotation was carried out with the KNIME® Analytics Platform (Berthold et al., 2008). The data integration process yields two datasets: 1) a dataset covering all human protein-coding genes, which can be used to analyse the number of scientific publications in the context of the potential disease relevance of individual genes; 2) a table with the annual and cumulative number of PubMed entries. For further interpretation of the data presented in this article, please see the research article 'Target 2035 - probing the human proteome' by Carter et al. (2019), https://doi.org/10.1016/j.drudis.2019.06.020.
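The two kinds of counts described above can be illustrated with a minimal Python sketch. This is not the authors' KNIME workflow. It assumes rows in the three-column gene2pubmed layout (tax_id, GeneID, PubMed_ID) and a hypothetical `pub_year` lookup from PubMed ID to publication year, since gene2pubmed itself carries no years. All values are invented toy data. Whether the published table counts distinct PubMed entries or gene-publication pairs is not stated in the abstract, so the sketch assumes distinct entries.

```python
from collections import Counter
from itertools import accumulate

HUMAN_TAX_ID = 9606  # NCBI Taxonomy ID for Homo sapiens

# Toy rows in the gene2pubmed layout: (tax_id, GeneID, PubMed_ID).
# All values are made up for illustration.
gene2pubmed_rows = [
    (9606, 1, 100),
    (9606, 1, 101),
    (9606, 2, 101),
    (9606, 2, 102),
    (10090, 3, 103),  # mouse record, filtered out below
]

# Hypothetical PubMed_ID -> publication year lookup (an assumption;
# in practice the years would come from PubMed metadata).
pub_year = {100: 2001, 101: 2001, 102: 2003, 103: 2003}


def publications_per_gene(rows, tax_id=HUMAN_TAX_ID):
    """Count distinct PubMed IDs annotated to each gene of one species."""
    pairs = {(gene, pmid) for tax, gene, pmid in rows if tax == tax_id}
    return Counter(gene for gene, _ in pairs)


def annual_and_cumulative(rows, years, tax_id=HUMAN_TAX_ID):
    """Return [(year, annual_count, cumulative_count)] over distinct PubMed IDs.

    Years without any publication are omitted from the result.
    """
    pmids = {pmid for tax, _, pmid in rows if tax == tax_id}
    per_year = Counter(years[p] for p in pmids if p in years)
    ordered = sorted(per_year)
    totals = accumulate(per_year[y] for y in ordered)
    return [(y, per_year[y], t) for y, t in zip(ordered, totals)]


per_gene = publications_per_gene(gene2pubmed_rows)
yearly = annual_and_cumulative(gene2pubmed_rows, pub_year)
```

Joining `per_gene` with per-gene OMIM and GWAS annotations, keyed on GeneID, would then give a table in the spirit of the first dataset; that join is omitted here because the abstract does not specify the annotation fields.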
