Generation of accurate, expandable phylogenomic trees with uDance

Nat Biotechnol. 2024 May;42(5):768-777. doi: 10.1038/s41587-023-01868-8. Epub 2023 Jul 27.

Abstract

Phylogenetic trees provide a framework for organizing evolutionary histories across the tree of life and aid downstream comparative analyses such as metagenomic identification. Methods that rely on single-marker genes such as 16S rRNA have produced trees of limited accuracy with hundreds of thousands of organisms, whereas methods that use genome-wide data are not scalable to large numbers of genomes. We introduce updating trees using divide-and-conquer (uDance), a method that enables updatable genome-wide inference using a divide-and-conquer strategy that refines different parts of the tree independently and can build off of existing trees, with high accuracy and scalability. With uDance, we infer a species tree of roughly 200,000 genomes using 387 marker genes, totaling 42.5 billion amino acid residues.

MeSH terms

  • Algorithms
  • Evolution, Molecular
  • Genomics / methods
  • Phylogeny*
  • Software