Rethinking large scale phylogenomics with EukPhylo v1.0, a flexible toolkit to enable phylogeny-informed data curation and analyses of diverse eukaryotic lineages

Abstract

Eukaryotic diversity is largely microbial, with macroscopic lineages (plants, animals and fungi) nesting among a plethora of diverse protists. Understanding of the evolutionary relationships among eukaryotes is rapidly advancing through omics analyses, but phylogenomic analyses are challenging for microeukaryotes, particularly uncultivable lineages, as single-cell sequencing approaches generate a mixture of sequences from hosts, associated microbiomes, and contaminants. Moreover, many analyses of eukaryotic gene families and phylogenies rely on boutique datasets and methods that are challenging for other research groups to replicate. To address these challenges, we present EukPhylo v1.0, a modular, user-friendly pipeline that enables effective data curation through phylogeny-informed contamination removal, estimation of homologous gene families (GFs), and generation of both multisequence alignments and gene trees. Analyses can use a hook database of ~15k ancient GFs, or users can easily replace this hook with a set of gene families of interest. We demonstrate the power of EukPhylo, including a suite of stand-alone utilities, through analyses of 500 conserved GFs sampled from 1,000 diverse species of eukaryotes, bacteria and archaea. We show improvements in estimates of the eukaryotic tree of life, recovering clades that are well established in the literature, through successive rounds of curation using the EukPhylo contamination loop. The final trees corroborate numerous hypotheses in the literature (e.g. Opisthokonta, Rhizaria, Amoebozoa) while challenging others (e.g. CRuMs, Obazoa, Diaphoretickes). We believe that the flexibility and transparency of EukPhylo set standards for curation of omics data for future studies.
