Construction of Phylogenetic Relationships Based on 8-mer Spectra Distribution Characteristics of Vertebrate Whole Genome Sequences

Abstract

Background/Objectives: With advances in sequencing technology, whole-genome sequences have become a valuable resource for deciphering species evolution. However, efficiently extracting phylogenetic information from such data remains a major challenge. Traditional multiple sequence alignment methods are computationally intensive and perform poorly for distantly related species, whereas k-mer analysis offers a new direction for efficiently capturing genomic composition and evolutionary signatures. Methods: Features were extracted from the 8-mer spectra of 16 XYi subsets. Results: The distribution characteristics of the 8-mer spectra of whole-genome sequences were found to be closely related to species evolution. Building on this, we developed a dual-feature strategy for genome-scale phylogenetics. The strategy incorporates two distinct feature types: (a) 186 class-level phylogenetic features (93 for separability and 93 for conservatism), identified from the 8-mer spectrum distributions of the 16 XYi subsets, which capture macroevolutionary patterns; and (b) order-level phylogenetic features, designated rank information, which are generated by ranking all 65,536 (4^8) possible 8-mers by frequency based on the long-tail distribution of the CGi subset and which capture microevolutionary patterns. Validation across vertebrate genomes confirmed that the class-level features establish the phylogenetic backbone, whereas the order-level features enable finer-resolution discrimination at the ordinal level. Conclusions: This study proposes a new method for constructing phylogenetic relationships at the genomic level.
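As a rough illustration of the rank information described in the abstract, the following Python sketch counts overlapping 8-mers in a DNA sequence and ranks all 65,536 possible 8-mers by frequency. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `count_kmers` and `rank_kmers` are hypothetical, ties are broken alphabetically by assumption, and the ranking is computed over the whole input sequence because the abstract does not define how the XYi or CGi subsets are constructed.

```python
from collections import Counter
from itertools import product

K = 8               # k-mer length used in the abstract
ALPHABET = "ACGT"   # 4**8 = 65,536 possible 8-mers


def count_kmers(seq, k=K):
    """Count overlapping k-mers, skipping windows with non-ACGT symbols.

    Hypothetical helper: the abstract does not say how ambiguous
    bases (e.g. N) are handled, so they are simply skipped here.
    """
    seq = seq.upper()
    valid = set(ALPHABET)
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= valid:
            counts[kmer] += 1
    return counts


def rank_kmers(counts, k=K):
    """Map every possible k-mer to a rank, where 1 is the most frequent.

    Assumption: ties (including unobserved k-mers with count 0)
    are broken alphabetically; the abstract does not specify this.
    """
    all_kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    ordered = sorted(all_kmers, key=lambda m: (-counts.get(m, 0), m))
    return {kmer: rank for rank, kmer in enumerate(ordered, start=1)}


if __name__ == "__main__":
    toy_sequence = "ACGTACGTACGT"  # toy input, not real genome data
    ranks = rank_kmers(count_kmers(toy_sequence))
    print(len(ranks), ranks["ACGTACGT"])
```

In this sketch the resulting rank vector has one entry per possible 8-mer, so rank vectors from different genomes are directly comparable position by position. How the authors derive distances or trees from such ranks is not stated in the abstract and is not modelled here.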
