Functional characterization of transcription factor motifs using cross-species comparison across large evolutionary distances

PLoS Comput Biol. 2010 Jan 29;6(1):e1000652. doi: 10.1371/journal.pcbi.1000652.

Abstract

We address the problem of finding statistically significant associations between cis-regulatory motifs and functional gene sets, in order to understand the biological roles of transcription factors. We develop a computational framework for this task, whose features include a new statistical score for motif scanning, the use of different scores for predicting targets of different motifs, and new ways to deal with redundancies among significant motif-function associations. This framework is applied to the recently sequenced genome of the jewel wasp, Nasonia vitripennis, making use of the existing knowledge of motifs and gene annotations in another insect genome, that of the fruit fly. The framework uses cross-species comparison to improve the specificity of its predictions, and does so without relying upon non-coding sequence alignment. It is therefore well suited for comparative genomics across large evolutionary divergences, where existing alignment-based methods are not applicable. We also apply the framework to find motifs associated with socially regulated gene sets in the honeybee, Apis mellifera, using comparisons with Nasonia, a solitary species, to identify honeybee-specific associations.
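
The core task described above, testing whether a motif's predicted targets overlap a functional gene set more than chance would allow, can be illustrated with a standard hypergeometric enrichment test. The sketch below is only an illustration under stated assumptions: the abstract does not specify the test the framework uses, and this is not the paper's new motif-scanning score. The function names (`hypergeom_pvalue`, `motif_set_association`) and the toy gene identifiers are hypothetical.

```python
from math import comb


def hypergeom_pvalue(n_genes, n_targets, n_in_set, n_overlap):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    Models drawing n_targets genes without replacement from n_genes,
    of which n_in_set belong to the functional gene set.
    """
    total = comb(n_genes, n_targets)
    upper = min(n_targets, n_in_set)
    tail = sum(
        comb(n_in_set, k) * comb(n_genes - n_in_set, n_targets - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def motif_set_association(universe, motif_targets, gene_set):
    """P-value for the overlap between a motif's predicted targets and a
    functional gene set, restricted to a common gene universe.

    Assumption: targets have already been predicted by some motif scanner;
    this function only scores the resulting overlap.
    """
    universe = set(universe)
    targets = set(motif_targets) & universe
    members = set(gene_set) & universe
    overlap = targets & members
    return hypergeom_pvalue(
        len(universe), len(targets), len(members), len(overlap)
    )


# Toy example with made-up gene identifiers (not data from the paper).
genes = [f"g{i}" for i in range(10)]
p = motif_set_association(genes, ["g0", "g1", "g2"], ["g0", "g1", "g2"])
print(p)
```

In practice, many motif and gene-set pairs are tested at once, so the resulting p-values would need a multiple-testing correction, and overlapping gene sets produce the kind of redundant associations that the framework is described as handling.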

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Motifs
  • Amino Acid Sequence
  • Animals
  • Bees / genetics
  • Behavior, Animal
  • Computational Biology / methods*
  • Conserved Sequence* / genetics
  • Conserved Sequence* / physiology
  • Drosophila melanogaster / genetics
  • Genome, Insect
  • Insect Proteins / genetics
  • Models, Genetic
  • Molecular Sequence Data
  • Species Specificity
  • Transcription Factors* / genetics
  • Transcription Factors* / physiology
  • Wasps / genetics

Substances

  • Insect Proteins
  • Transcription Factors