Abstract
Phylogenomics pipelines are designed to reconstruct evolutionary relationships among groups of organisms. Existing pipelines depend on reference gene sets, from which target copies are retrieved through read mapping. This read-mapping approach is limited by the availability of reference orthologs closely related to the target taxa, which reduces its utility for nonmodel organisms. We introduce OrthoGarden, an automated and containerized de novo assembly-based phylogenomics pipeline designed to recover accurate and reproducible phylogenies from any combination of short reads and assemblies, with particular emphasis on nonmodel taxa. OrthoGarden is tested using 3 datasets of varying size, scope, and taxonomic identity and is benchmarked against other phylogenomics pipelines for accuracy. When closely related reference orthologs are available, OrthoGarden produces phylogenies with accuracy comparable to that of existing pipelines; when only distantly related reference orthologs are available, OrthoGarden yields more accurate phylogenies than read-mapping approaches. OrthoGarden produces highly accurate phylogenies across a wide range of taxa. Automated phylogenetic reconstruction using genes recovered through all-vs-all orthology inference among selected taxa allows for phylogenomic analysis without requiring in-group reference orthologs. Datasets of nonmodel taxa especially benefit from OrthoGarden's efficacy in the absence of a closely related reference group. Its consistent accuracy, automated use of computational resources, and ability to use both short reads and assemblies make OrthoGarden a community-focused pipeline for both model and nonmodel phylogenomics. OrthoGarden is publicly available at github.com/jacksonhturner/orthogarden.