Mapping biomedical concepts onto the human genome by mining literature on chromosomal aberrations

Nucleic Acids Res. 2007;35(8):2533-43. doi: 10.1093/nar/gkm054. Epub 2007 Apr 1.

Abstract

Biomedical literature provides a rich but unstructured source of associations between chromosomal regions and biomedical concepts. By mining MEDLINE abstracts, we annotate the human genome at the level of cytogenetic bands. Our method creates a set of chromosomal aberration maps that associate cytogenetic bands with biomedical concepts from a variety of controlled vocabularies, including disease, dysmorphology, anatomy, development and Gene Ontology branches. The association between a band (e.g. 4p16.3) and a concept (e.g. microcephaly) is assessed by the statistical overrepresentation of this concept in the abstracts relating to this band. Our method is validated using existing genome annotation resources and known chromosomal aberration maps and is further illustrated through a case study on heart disease. Our chromosomal aberration maps provide diagnostic support to clinical geneticists, help cytogeneticists interpret and report cytogenetic findings and support researchers interested in human gene function. The method is available as a web application, aBandApart, at http://www.esat.kuleuven.be/abandapart/.
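
The abstract does not say which statistic is used to measure overrepresentation. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for this kind of enrichment question, not necessarily the authors' method), a band-concept association could be scored as below. The function name `overrepresentation_p_value` and all counts are hypothetical and chosen only for illustration.

```python
from math import comb


def overrepresentation_p_value(both, total, with_concept, with_band):
    """One-sided hypergeometric p-value, P(X >= both).

    Assumed setup (not taken from the paper):
      total        -- number of abstracts in the corpus
      with_concept -- abstracts mentioning the concept (e.g. microcephaly)
      with_band    -- abstracts mentioning the band (e.g. 4p16.3)
      both         -- abstracts mentioning both
    """
    upper = min(with_concept, with_band)
    # Count the ways to pick `with_band` abstracts such that at least
    # `both` of them mention the concept, over all possible picks.
    favourable = sum(
        comb(with_concept, i) * comb(total - with_concept, with_band - i)
        for i in range(both, upper + 1)
    )
    return favourable / comb(total, with_band)


# Illustrative counts only: 1000 abstracts, 50 mention the concept,
# 20 mention the band, 5 mention both.
p = overrepresentation_p_value(both=5, total=1000, with_concept=50, with_band=20)
print(f"p-value for the band-concept pair: {p:.3g}")
```

A small p-value means the concept appears in the band's abstracts more often than chance would predict. A full map would repeat this test for every band-concept pair, so a multiple-testing correction would be needed before reporting associations.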

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Chromosome Aberrations*
  • Chromosome Banding*
  • Chromosome Disorders / genetics*
  • Chromosome Mapping / methods*
  • Congenital Abnormalities / genetics
  • Genetic Predisposition to Disease
  • Genome, Human*
  • Heart Diseases / genetics
  • Humans
  • Internet
  • MEDLINE*
  • Software
  • Vocabulary, Controlled