Inference of population mutation rate and detection of segregating sites from next-generation sequence data

Genetics. 2011 Oct;189(2):595-605. doi: 10.1534/genetics.111.130898. Epub 2011 Aug 11.

Abstract

We live in an age in which our ability to collect large amounts of genome-wide genetic variation data offers the promise of providing the key to the understanding and treatment of genetic diseases. Over the next few years this effort will be spearheaded by so-called next-generation sequencing technologies, which provide vast amounts of short-read sequence data at relatively low cost. These technologies are often used to detect unknown variation in regions that have been linked with a given disease or phenotype. However, their error rates are significant, which makes the resulting data nontrivial to interpret. In this article, we present a method for addressing two questions of widespread interest: calling variants and estimating the population mutation rate. We demonstrate the performance of the method using simulation studies before applying our approach to an analysis of data from the 1000 Genomes Project.
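
To make the two quantities named in the abstract concrete, the following is a minimal sketch of the classical Watterson estimator of the population mutation rate, computed from the number of segregating sites in a sample. It is not the method presented in the article, whose details are not given in the abstract. The function names and the toy alignment are hypothetical, and the sketch assumes error-free sequences, which is precisely the assumption that significant sequencing error rates undermine.

```python
from typing import Sequence


def count_segregating_sites(alignment: Sequence[str]) -> int:
    """Count alignment columns that carry more than one distinct allele.

    Assumes (hypothetically) equal-length, error-free sequences.
    """
    if not alignment:
        return 0
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    # zip(*alignment) iterates over columns of the alignment.
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


def watterson_theta(num_segregating_sites: int, num_sequences: int) -> float:
    """Watterson's estimator: S divided by the harmonic sum 1 + 1/2 + ... + 1/(n-1)."""
    if num_sequences < 2:
        raise ValueError("at least two sequences are required")
    harmonic_sum = sum(1.0 / i for i in range(1, num_sequences))
    return num_segregating_sites / harmonic_sum


if __name__ == "__main__":
    # Toy alignment, invented purely for illustration.
    sample = ["ACGT", "ACGA", "TCGT"]
    s = count_segregating_sites(sample)
    print(s, watterson_theta(s, len(sample)))
```

With raw short-read data, a sequencing error at a single read can make a monomorphic site appear segregating, so a naive count like the one above would tend to inflate the estimate. That is the kind of difficulty the abstract refers to.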

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Genetic Predisposition to Disease / genetics
  • Genetic Variation
  • Genetics, Population
  • Genome, Human / genetics*
  • Genome-Wide Association Study / methods*
  • Genotype
  • Humans
  • Mutation Rate*
  • Phenotype
  • Polymorphism, Single Nucleotide
  • Sequence Analysis, DNA / methods*