GC Content in Nuclear-Encoded Genes and Effective Number of Codons (ENC) Are Positively Correlated in AT-Rich Species and Negatively Correlated in GC-Rich Species



Abstract

BACKGROUND/OBJECTIVES: Codon usage bias affects gene expression and translation efficiency across species. The effective number of codons (ENC) and GC content influence codon preference, often displaying unimodal or bimodal distributions. This study investigates the correlation between ENC and GC rankings across species and how their relationship affects codon usage distributions.

METHODS: I analyzed nuclear-encoded genes from 17 species representing six kingdoms: one bacterium (Escherichia coli), three fungi (Saccharomyces cerevisiae, Neurospora crassa, and Schizosaccharomyces pombe), one archaeon (Methanococcus aeolicus), three protists (Rickettsia hoogstraalii, Dictyostelium discoideum, and Plasmodium falciparum), three plants (Musa acuminata, Oryza sativa, and Arabidopsis thaliana), and six animals (Anopheles gambiae, Apis mellifera, Polistes canadensis, Mus musculus, Homo sapiens, and Takifugu rubripes). Genes in all 17 species were ranked by GC content and by ENC, and the correlations between the two rankings were assessed. I examined how adding or subtracting these rankings influenced their overall distribution, using a new method that I call Two-Rank Order Normalization (TRON). TRON is defined as

$$\mathrm{TRON} = \frac{\sum_{i=1}^{N} \left| \mathrm{GC\ rank}_i - \mathrm{ENC\ rank}_i \right|}{N^2/3},$$

where $\mathrm{GC\ rank}_1, \ldots, \mathrm{GC\ rank}_N$ is the rank-order series of GC rank for the $N$ genes and $\mathrm{ENC\ rank}_1, \ldots, \mathrm{ENC\ rank}_N$ is the corresponding series of ENC rank, listed in the order given by the GC-rank series. The denominator, $N^2/3$, is the normalization factor because it is the expected value of the sum of $\left| \mathrm{GC\ rank} - \mathrm{ENC\ rank} \right|$ over all genes if GC rank and ENC rank are not correlated.

RESULTS: First, ENC and GC rankings are positively correlated (i.e., ENC increases as GC increases) in AT-rich species such as honeybees ($R^2 = 0.60$, slope = 0.78) and wasps ($R^2 = 0.52$, slope = 0.72) and negatively correlated (i.e., ENC decreases as GC increases) in GC-rich species such as humans ($R^2 = 0.38$, slope = -0.61) and rice ($R^2 = 0.59$, slope = -0.77). Second, the distributions of GC rank minus ENC rank change from unimodal to bimodal as GC content increases across the 17 species. Third, the distributions of GC rank plus ENC rank change from bimodal to unimodal as GC content increases across the 17 species. Fourth, the slopes of the correlations (GC versus ENC) in all 17 species are negatively correlated with TRON ($R^2 = 0.98$) (see Graphic Abstract).

CONCLUSIONS: The correlation between ENC rank and GC rank differs among species, shaping codon usage distributions in opposite ways depending on whether a species' nuclear-encoded genes are AT-rich or GC-rich. Understanding these patterns might provide insights into translation efficiency, epigenetics mediated by CpG DNA methylation, epitranscriptomics of RNA modifications, RNA secondary structures, evolutionary pressures, and potential applications in genetic engineering and biotechnology.
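
The TRON statistic described in the METHODS can be sketched in a few lines of Python. This is a minimal illustration rather than the author's implementation: the function names `ordinal_ranks` and `tron` are hypothetical, and ties are broken by input order, which is an assumption because the abstract does not say how tied GC or ENC values are ranked. For reference, the expected sum of absolute rank differences for two independent random rankings of $N$ genes is exactly $(N^2 - 1)/3$, which approaches the $N^2/3$ denominator for large $N$.

```python
import random


def ordinal_ranks(values):
    """Return 1-based ranks for `values`.

    Ties are broken by input order; this tie rule is an assumption,
    since the abstract does not specify one.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def tron(gc_values, enc_values):
    """Sum of |GC rank - ENC rank| over all genes, divided by N^2 / 3."""
    if len(gc_values) != len(enc_values):
        raise ValueError("gc_values and enc_values must have the same length")
    n = len(gc_values)
    if n == 0:
        raise ValueError("at least one gene is required")
    gc_rank = ordinal_ranks(gc_values)
    enc_rank = ordinal_ranks(enc_values)
    total = sum(abs(g - e) for g, e in zip(gc_rank, enc_rank))
    return total / (n ** 2 / 3)


if __name__ == "__main__":
    rng = random.Random(0)  # synthetic values, not data from the study
    n = 2000
    gc = [rng.random() for _ in range(n)]
    enc_concordant = list(gc)               # same ordering as GC
    enc_reversed = [-g for g in gc]         # opposite ordering
    enc_unrelated = [rng.random() for _ in range(n)]
    print("concordant:", tron(gc, enc_concordant))
    print("reversed:  ", tron(gc, enc_reversed))
    print("unrelated: ", tron(gc, enc_unrelated))
```

Under this definition, identical rankings give TRON = 0, fully reversed rankings give TRON = 1.5 (for even $N$), and unrelated rankings give a value near 1. That ordering is consistent with the reported negative correlation between the GC-versus-ENC slope and TRON.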
