larch: mapping the parsimony-optimal landscape of trees for directed exploration



Abstract

Phylogenetic inference algorithms for large data sets typically return a single tree. However, there are often many optimal trees, especially when the sequences are closely related. We develop a compact representation of large collections of maximally parsimonious histories, that is, trees with mutations mapped onto tree edges. Our C++ implementation, larch, leverages this representation for a highly parallel search algorithm. The storage component uses our history DAG structure to compactly represent large families of optimal trees. The search algorithm integrates this storage with matOptimize for rapid tree optimization; the DAG structure allows us to accept thousands of conflicting tree rearrangements in parallel. The integration enables a new type of tree search: one that systematically maps out the collection of good trees, enabling moves that are directed away from the current set of optimal trees in order to cross valleys and increase the diversity of that set. This search is able to identify more parsimonious trees than are found by other methods. We find diverse optimality landscapes for viral datasets, including many distinct plateaux. We also find that our implementation produces similar results whether it starts from any of a variety of single trees or from an ensemble of starting trees, indicating effective global optimization.
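To give a feel for how a DAG can store many trees compactly, the following is a minimal toy sketch in Python. It is not larch's data structure or API. The dictionary layout, the names `TOY_DAG`, `count_trees`, and `min_mutations`, and the per-edge mutation counts are all assumptions made for illustration. Each node lists its child clades, each clade lists alternative child nodes, and one tree is selected by picking one alternative per clade.

```python
from functools import lru_cache
from math import prod

# Hypothetical toy DAG (not larch's format): node -> list of clade groups.
# Each group lists alternative (child, mutations_on_edge) pairs; choosing
# one alternative per group at every visited node selects one tree.
TOY_DAG = {
    "root": [[("A", 1), ("B", 1)], [("leaf3", 0)]],
    "A": [[("leaf1", 0)], [("leaf2", 1)]],
    "B": [[("leaf1", 1)], [("leaf2", 0)]],
    "leaf1": [],
    "leaf2": [],
    "leaf3": [],
}


def count_trees(dag, root="root"):
    """Count the trees encoded below `root` without enumerating them."""

    @lru_cache(maxsize=None)
    def count(node):
        # Alternatives within a clade add; independent clades multiply.
        # A leaf has no clade groups, so the empty product gives 1.
        return prod(sum(count(child) for child, _ in group) for group in dag[node])

    return count(root)


def min_mutations(dag, root="root"):
    """Smallest total mutation count over all trees encoded below `root`."""

    @lru_cache(maxsize=None)
    def best(node):
        # For each clade, keep the cheapest alternative; clades add up.
        return sum(min(m + best(child) for child, m in group) for group in dag[node])

    return best(root)


if __name__ == "__main__":
    print(count_trees(TOY_DAG), min_mutations(TOY_DAG))
```

Because shared subtrees are stored once and counted by dynamic programming, the number of encoded trees can grow multiplicatively with the number of alternatives while the stored structure grows only additively.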
