Average genome size: a potential source of bias in comparative metagenomics

ISME J. 2010 Aug;4(8):1075-7. doi: 10.1038/ismej.2010.29. Epub 2010 Mar 25.

Abstract

In gene-centric comparative metagenomics, differences in observed relative gene abundances among samples are often assumed to reflect the biological importance of individual genes in different habitats. Statistical tests and data mining for genes that represent habitat-specific adaptations are frequently based on this measure. We demonstrate that this measure is biased by the average genome size of the communities sampled. Average genome sizes can be estimated from the metagenomic data themselves, and taken into account in comparative analyses. We suggest that this would enable ecologically more meaningful comparisons, especially when the average genome sizes of compared communities differ substantially. We illustrate the influence of average genome-size differences on comparative analyses, with an example to highlight the need for further exploration of this bias.
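
The direction of this bias can be illustrated with a small numerical sketch. It is not the authors' method. It assumes idealized shotgun sequencing in which bases are sampled uniformly across all genomes in a community, so that a gene's expected relative abundance is its per-genome copy number times its length, divided by the average genome size. The function names, the 1,000 bp gene length, and the 2 Mbp and 6 Mbp average genome sizes are hypothetical values chosen for illustration, and the average genome sizes are taken as given rather than estimated from the data.

```python
def relative_abundance(copies_per_genome, avg_genome_size_bp, gene_length_bp=1000):
    """Expected fraction of sequenced bases that fall within the gene,
    assuming bases are sampled uniformly across the community's genomes."""
    return copies_per_genome * gene_length_bp / avg_genome_size_bp


def copies_per_genome_estimate(rel_abundance, avg_genome_size_bp, gene_length_bp=1000):
    """Rescale a relative abundance by average genome size to recover
    the average number of gene copies per genome."""
    return rel_abundance * avg_genome_size_bp / gene_length_bp


# Hypothetical communities: the same gene occurs once per genome in both,
# but the average genome sizes differ threefold.
size_a = 2_000_000  # bp, assumed average genome size of community A
size_b = 6_000_000  # bp, assumed average genome size of community B

abund_a = relative_abundance(1, size_a)
abund_b = relative_abundance(1, size_b)

# Uncorrected comparison: the gene looks enriched in the small-genome community.
fold_uncorrected = abund_a / abund_b

# Corrected comparison: both communities carry one copy per genome.
per_genome_a = copies_per_genome_estimate(abund_a, size_a)
per_genome_b = copies_per_genome_estimate(abund_b, size_b)
fold_corrected = per_genome_a / per_genome_b

print(f"uncorrected fold difference: {fold_uncorrected:.2f}")
print(f"corrected fold difference:   {fold_corrected:.2f}")
```

Under these assumptions the uncorrected fold difference equals the ratio of the two average genome sizes even though every genome in both communities carries exactly one copy of the gene. Rescaling by average genome size removes the apparent enrichment.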

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteria / chemistry
  • Bacteria / genetics*
  • Genome, Bacterial*
  • Metagenomics*