MOTIVATION: Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than alignment-based methods.

RESULTS: We propose a new non-parametric alignment-free sequence comparison method, called K2, based on Kendall statistics. Compared with other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating phylogenetic trees, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which determines the appropriate algorithmic parameter (length) automatically, without first trying different values. Comparative analysis with state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length or increasing dataset size.

AVAILABILITY AND IMPLEMENTATION: The K2 and K2* approaches are implemented as an R package that is freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz).

CONTACT: yueljiang@163.com.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics.
Authors: Lin Jie, Adjeroh Donald A, Jiang Bing-Hua, Jiang Yue
| Journal: | Bioinformatics | Impact factor: | 5.400 |
| Year: | 2018 | Issue and pages: | 2018 May 15; 34(10):1682-1689 |
| DOI: | 10.1093/bioinformatics/btx809 | | |
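
The abstract says only that K2 is based on Kendall statistics and that it has a length parameter; it does not give the formula. As a rough illustration of the general idea, and not the authors' K2 statistic or the API of their R package, the sketch below counts fixed-length words (k-mers) in two sequences and compares the two count vectors with Kendall's tau-b rank correlation. The function names, the DNA alphabet, and the choice of k = 2 are all assumptions made for this example.

```python
from itertools import product
from math import sqrt


def kmer_counts(seq, k, alphabet="ACGT"):
    """Count every possible k-mer over `alphabet`, returned in a fixed order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing symbols outside the alphabet
            counts[word] += 1
    return [counts[w] for w in kmers]


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length vectors (simple O(n^2) version)."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both vectors: contributes nothing
            if dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + only_x_tied) * (untied + only_y_tied))
    return (concordant - discordant) / denom if denom else 0.0


def kendall_kmer_similarity(seq_a, seq_b, k=2):
    """Hypothetical helper: rank correlation of the two k-mer count profiles."""
    return kendall_tau_b(kmer_counts(seq_a, k), kmer_counts(seq_b, k))


if __name__ == "__main__":
    print(kendall_kmer_similarity("ACGTACGTAC", "ACGTTCGTAC", k=2))
```

Because only the rank order of the counts matters, a measure of this kind needs no assumed background model for the sequences, which is in keeping with the non-parametric framing in the abstract. The pure-Python pair loop is quadratic in the number of k-mers and is meant for reading, not for large-scale use.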
