Abstract
Phylogenetics has a central role in evolutionary biology and genomic epidemiology(1). Assessing phylogenetic confidence and reliability is therefore crucial, and the methods that do this, such as those derived from Felsenstein's bootstrap(2), are among the most widely used in modern science. However, these methods require enormous computational capacity and are unsuitable for large datasets. Furthermore, most of these methods emerge from a focus on the membership of clades (groupings of taxa), which makes their results difficult to interpret in the context of genomic epidemiology. Here we propose subtree pruning and regrafting-based tree assessment (SPRTA), an efficient and interpretable approach to assess confidence in phylogenetic trees. SPRTA shifts the paradigm of phylogenetic support measurement from evaluating confidence in clades to evaluating confidence in evolutionary histories and phylogenetic placement: for example, assessing whether a lineage evolved from another considered lineage, which is particularly valuable in genomic epidemiology. We use SPRTA to investigate a global public SARS-CoV-2 phylogenetic tree relating more than two million genomes, highlighting plausible alternative evolutionary origins of many SARS-CoV-2 variants, assessing the reliability of the Pango outbreak lineage classification system(3), and demonstrating the effect of phylogenetic uncertainty on inferred mutation rates. Our results show that SPRTA enables pandemic-scale and detailed probabilistic assessment of transmission and mutational histories. Our method introduces a new approach to assessing phylogenetic confidence, enhancing the interpretability of pandemic-scale phylogenetic analyses and improving our ability to prepare for and respond to future pandemics.