Visualization and interpretation of protein networks in Mycobacterium tuberculosis based on hierarchical clustering of genome-wide functional linkage maps

Nucleic Acids Res. 2003 Dec 15;31(24):7099-109. doi: 10.1093/nar/gkg924.

Abstract

Genome-wide functional linkages among proteins in cellular complexes and metabolic pathways can be inferred from high-throughput experiments, such as DNA microarrays, or from bioinformatic analyses. Here we describe a method for the visualization and interpretation of genome-wide functional linkages inferred by the Rosetta Stone, Phylogenetic Profile, Operon and Conserved Gene Neighbor computational methods. The method involves constructing a genome-wide functional linkage map, in which each significant functional linkage between a pair of proteins is displayed as a point on a two-dimensional scatter plot whose axes follow the order of genes along the chromosome. Subsequent hierarchical clustering of the map reveals clusters of genes with similar functional linkage profiles, which facilitates the inference of protein function and the discovery of functionally linked gene clusters throughout the genome. We illustrate this method by applying it to the genome of the pathogenic bacterium Mycobacterium tuberculosis, assigning cellular functions to previously uncharacterized proteins involved in cell wall biosynthesis, signal transduction, chaperone activity, energy metabolism and polysaccharide biosynthesis.
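The two steps described in the abstract (building a linkage map indexed by chromosomal gene order, then clustering genes by their linkage profiles) can be illustrated with a small sketch. The abstract does not state the distance measure, linkage criterion or cutoff used in the paper, so the Jaccard distance, average linkage and the `threshold` parameter below are assumptions chosen for illustration. The function names and the six-gene toy data are likewise hypothetical.

```python
from itertools import combinations

def build_linkage_map(n_genes, links):
    """Symmetric 0/1 matrix; row and column indices follow gene order."""
    matrix = [[0] * n_genes for _ in range(n_genes)]
    for i, j in links:
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix

def jaccard_distance(a, b):
    """1 - |shared links| / |links in either profile| (assumed metric)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 1.0

def cluster_profiles(matrix, threshold):
    """Average-linkage agglomerative clustering of the rows (assumed scheme).

    Merging stops once the closest pair of clusters is farther apart
    than `threshold`.
    """
    clusters = [[i] for i in range(len(matrix))]

    def dist(c1, c2):
        total = sum(jaccard_distance(matrix[i], matrix[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        if dist(clusters[a], clusters[b]) > threshold:
            break
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)

# Toy data (hypothetical): six genes numbered in chromosomal order.
links = [(0, 3), (0, 4), (1, 3), (1, 4), (2, 5)]
linkage_map = build_linkage_map(6, links)
clusters = cluster_profiles(linkage_map, threshold=0.5)
print(clusters)
```

In this toy example genes 0 and 1 are not linked to each other, yet they fall into the same cluster because both are linked to genes 3 and 4. That is the sense in which clustering groups genes by similar linkage profiles rather than by direct linkage.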

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism*
  • Cell Wall / metabolism
  • Computational Biology / methods*
  • Conserved Sequence / genetics
  • Genes, Bacterial / genetics
  • Genome, Bacterial*
  • Multigene Family / genetics*
  • Mycobacterium tuberculosis / cytology
  • Mycobacterium tuberculosis / genetics*
  • Mycobacterium tuberculosis / metabolism*
  • Operon / genetics
  • Phylogeny
  • Protein Binding
  • Proteome / genetics
  • Proteome / metabolism
  • Proteomics
  • Reproducibility of Results
  • Signal Transduction
  • Software

Substances

  • Bacterial Proteins
  • Proteome