Identification of functional transcription factor binding sites using closely related Saccharomyces species

Genome Res. 2005 May;15(5):701-9. doi: 10.1101/gr.3578205. Epub 2005 Apr 18.

Abstract

Comparative genomics provides a rapid means of identifying functional DNA elements by their sequence conservation between species. Transcription factor binding sites (TFBSs) may constitute a significant fraction of these conserved sequences, but the annotation of specific TFBSs is complicated by the fact that these short, degenerate sequences may frequently be conserved by chance rather than functional constraint. To identify intergenic sequences that function as TFBSs, we calculated the probability of binding site conservation between Saccharomyces cerevisiae and its two closest relatives under a neutral model of evolution. We found that this probability is <5% for 134 of 163 transcription factor binding motifs, implying that we can reliably annotate binding sites for the majority of these transcription factors by conservation alone. Although our annotation relies on a number of assumptions, mutations in five of five conserved Ume6 binding sites and three of four conserved Ndt80 binding sites show Ume6- and Ndt80-dependent effects on gene expression. We also found that three of five unconserved Ndt80 binding sites show Ndt80-dependent effects on gene expression. Together these data imply that although sequence conservation can be reliably used to predict functional TFBSs, unconserved sequences might also make a significant contribution to a species' biology.
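To make the idea of "conserved by chance" concrete, the following is a minimal sketch and not the authors' model, which the abstract does not specify. It assumes that each base of a site evolves independently, that each relative's divergence from S. cerevisiae is independent of the other's (ignoring shared ancestry), and that the site must match exactly (ignoring motif degeneracy). The function name and the per-base identity values are hypothetical placeholders, not figures from the paper.

```python
def neutral_conservation_probability(site_length, identity_per_species):
    """Chance that every base of a site is unchanged in all relatives.

    site_length: number of bases in the binding site.
    identity_per_species: for each relative, the assumed per-base
        probability that a neutrally evolving base is identical to
        the S. cerevisiae base (hypothetical values).
    """
    prob = 1.0
    for identity in identity_per_species:
        # Independent positions: all site_length bases must match.
        prob *= identity ** site_length
    return prob


# Hypothetical per-base identities for two relatives (placeholders).
identities = [0.8, 0.7]

for length in (3, 6, 10):
    p = neutral_conservation_probability(length, identities)
    verdict = "below" if p < 0.05 else "not below"
    print(f"site length {length}: P(conserved by chance) = {p:.4f} ({verdict} 5%)")
```

Under these assumptions, longer sites are much less likely to be conserved by chance than short ones, which is consistent with the abstract's point that conservation alone can be informative for some motifs but not for others.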

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Motifs / genetics
  • Base Sequence
  • Binding Sites / genetics
  • Computational Biology
  • Conserved Sequence / genetics
  • DNA, Intergenic / genetics
  • DNA-Binding Proteins / genetics
  • DNA-Binding Proteins / metabolism*
  • Evolution, Molecular*
  • Gene Expression*
  • Genomics / methods
  • Models, Genetic*
  • Molecular Sequence Data
  • Mutation / genetics
  • Repressor Proteins / genetics
  • Repressor Proteins / metabolism*
  • Saccharomyces / genetics*
  • Saccharomyces cerevisiae Proteins / genetics
  • Saccharomyces cerevisiae Proteins / metabolism*
  • Sequence Alignment
  • Species Specificity
  • Transcription Factors / genetics
  • Transcription Factors / metabolism*

Substances

  • DNA, Intergenic
  • DNA-Binding Proteins
  • NDT80 protein, S cerevisiae
  • Repressor Proteins
  • Saccharomyces cerevisiae Proteins
  • Transcription Factors
  • UME6 protein, S cerevisiae