The P10K database: a data portal for the protist 10 000 genomes project


Abstract

Protists, a highly diverse group of microscopic eukaryotic organisms distinct from fungi, animals and plants, play crucial roles within the Earth's biosphere. However, the genomes of only a small fraction of known protist species have been published and made publicly accessible. To address this gap, the Protist 10 000 Genomes Project (P10K) was initiated, implementing a specialized pipeline for single-cell genome/transcriptome assembly, decontamination and annotation of protists. The resulting P10K database (https://ngdc.cncb.ac.cn/p10k/) serves as a comprehensive platform that collates and disseminates genome sequences and annotations from diverse protist groups. The P10K database currently holds 2959 genomes and transcriptomes, comprising 1101 datasets newly sequenced by P10K and 1858 publicly available datasets. It covers 45% of protist orders, with particularly strong representation of ciliates (53% coverage), for which it provides nearly a thousand genomes/transcriptomes. Analysis of codon table usage among ciliates revealed differences from the codon tables assigned in the NCBI taxonomy system, suggesting that the codon tables used for these species need to be revised. Collectively, the P10K database serves as a valuable repository of genetic resources for protist research and aims to expand its collection by incorporating more sequenced data and advanced analysis tools to benefit protist studies worldwide.
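To illustrate why the choice of codon table matters for ciliates, the following minimal Python sketch translates the same coding sequence under the standard genetic code (NCBI translation table 1) and under the ciliate nuclear code (NCBI translation table 6), in which TAA and TAG encode glutamine rather than a stop. The abstract does not state which tables P10K assigned to which species, so the choice of table 6, the example sequence, and the function name `translate` are assumptions made purely for illustration.

```python
from itertools import product

# Codons enumerated in the conventional NCBI order (bases T, C, A, G).
BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

# Amino acids for NCBI translation table 1 (standard code), in the same order.
STANDARD_AAS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD = dict(zip(CODONS, STANDARD_AAS))

# NCBI translation table 6 (ciliate nuclear code): TAA and TAG encode Gln (Q);
# TGA remains the only stop codon.
CILIATE = {**STANDARD, "TAA": "Q", "TAG": "Q"}


def translate(seq: str, table: dict) -> str:
    """Translate a DNA sequence codon by codon, stopping at the first stop codon.

    Hypothetical helper for illustration; it is not part of the P10K pipeline.
    """
    seq = seq.upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = table[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Made-up example sequence: ATG CAA TAA GGA TAG TGA
example = "ATGCAATAAGGATAGTGA"
print(translate(example, STANDARD))  # truncated at the first TAA
print(translate(example, CILIATE))   # reads through TAA/TAG, stops at TGA
```

Under the standard table, translation of this made-up sequence ends at the first TAA, whereas under table 6 it continues to the TGA stop. An incorrectly assigned table can therefore truncate or extend predicted proteins during annotation.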
