Rapid Targeted Assembly of the Proteome Reveals Evolutionary Variation of GC Content in Avian Lice



Abstract

Nucleotide base composition plays an influential role in the molecular mechanisms involved in gene function, phenotype, and amino acid composition. GC content (the proportion of guanine and cytosine in DNA sequences) varies widely within and among species. Many studies measure GC content in a small number of genes, which may not be representative of genome-wide GC variation. One challenge in assembling extensive genomic data sets for these studies is the substantial monetary and computational cost of data processing, and many bioinformatic tools have not been optimized for resource efficiency. Using a high-performance computing (HPC) cluster, we varied the resources allocated to the targeted gene assembly program aTRAM (automated target restricted assembly method) to determine the optimal way to run the program and make the most efficient use of resources. Using this optimal assembly approach, we assembled and measured the GC content of all of the protein-coding genes of a diverse group of parasitic feather lice. Across the 499,426 genes assembled from 57 species, feather lice were GC-poor (mean GC = 42.96%), with a significant amount of variation within and between species (GC range = 19.57%-73.33%). We found a significant correlation between GC content and standard deviation per taxon for both overall GC and GC3 (GC content at third codon positions), which could indicate selection for G and C nucleotides in some species. Phylogenetic signal was detected in both overall GC and GC3. This research provides a large-scale investigation of GC content in parasitic lice, laying the foundation for understanding the basis of variation in base composition across species.
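To make the two measures concrete, the following is a minimal Python sketch of how overall GC and third-codon-position GC (commonly written GC3) can be computed for a single protein-coding sequence. It is not the authors' pipeline: the function names `gc_content` and `gc3_content` are hypothetical, the input is assumed to be an in-frame coding sequence starting at its first codon position, and ambiguous bases such as N are simply excluded from the denominator.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among unambiguous bases (A, C, G, T).

    Assumption: ambiguous characters (e.g. N, gaps) are ignored rather than
    counted in the denominator.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


def gc3_content(cds: str) -> float:
    """Return GC percentage at third codon positions.

    Assumption: `cds` is in frame, so positions 3, 6, 9, ... (1-based)
    are third codon positions. A trailing partial codon contributes nothing.
    """
    third_positions = cds[2::3]  # every third base, starting at index 2
    return gc_content(third_positions)


if __name__ == "__main__":
    # Toy sequence for illustration only: codons ATG GCC AAA TAA
    example = "ATGGCCAAATAA"
    print(f"GC  = {gc_content(example):.2f}%")
    print(f"GC3 = {gc3_content(example):.2f}%")
```

Applied per gene and then summarized per taxon, values of this kind would give the per-species means and standard deviations that the abstract compares.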
