Identifying the basal angiosperm node in chloroplast genome phylogenies: sampling one's way out of the Felsenstein zone

Mol Biol Evol. 2005 Oct;22(10):1948-63. doi: 10.1093/molbev/msi191. Epub 2005 Jun 8.

Abstract

While there has been strong support for Amborella and Nymphaeales (water lilies) as branching from basal-most nodes in the angiosperm phylogeny, this hypothesis has recently been challenged by phylogenetic analyses of 61 protein-coding genes extracted from the chloroplast genome sequences of Amborella, Nymphaea, and 12 other available land plant chloroplast genomes. These character-rich analyses placed the monocots, represented by three grasses (Poaceae), as sister to all other extant angiosperm lineages. We have extracted protein-coding regions from draft sequences for six additional chloroplast genomes to test whether this surprising result could be an artifact of long-branch attraction due to limited taxon sampling. The added taxa include three monocots (Acorus, Yucca, and Typha), a water lily (Nuphar), a ranunculid (Ranunculus), and a gymnosperm (Ginkgo). Phylogenetic analyses of the expanded DNA and protein data sets together with microstructural characters (indels) provided unambiguous support for Amborella and the Nymphaeales as branching from the basal-most nodes in the angiosperm phylogeny. However, their relative positions proved to be dependent on the method of analysis, with parsimony favoring Amborella as sister to all other angiosperms and maximum likelihood (ML) and neighbor-joining methods favoring an Amborella + Nymphaeales clade as sister to all other angiosperms. The ML phylogeny supported the latter hypothesis, but the likelihood for the former hypothesis was not significantly different. Parametric bootstrap analysis, single-gene phylogenies, estimated divergence dates, and conflicting indel characters all help to illuminate the nature of the conflict in resolution of the most basal nodes in the angiosperm phylogeny. Molecular dating analyses provided median age estimates of 161 MYA for the most recent common ancestor (MRCA) of all extant angiosperms and 145 MYA for the MRCA of monocots, magnoliids, and eudicots. Whereas long sequences reduce variance in branch lengths and molecular dating estimates, the impact of improved taxon sampling on the rooting of the angiosperm phylogeny together with the results of parametric bootstrap analyses demonstrate how long-branch attraction might mislead genome-scale phylogenetic analyses.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Chloroplasts / genetics*
  • Codon / genetics
  • DNA Transposable Elements
  • DNA, Plant / genetics
  • DNA, Plant / isolation & purification
  • Evolution, Molecular
  • Genome, Plant
  • Magnoliopsida / classification
  • Magnoliopsida / genetics*
  • Phylogeny*
  • Sequence Deletion

Substances

  • Codon
  • DNA Transposable Elements
  • DNA, Plant