Lost in the space of bioinformatic tools: a constantly updated survival guide for genetic epidemiology. The GenEpi Toolbox

Atherosclerosis. 2010 Apr;209(2):321-35. doi: 10.1016/j.atherosclerosis.2009.10.026. Epub 2009 Oct 29.

Abstract

Genome-wide association studies (GWASs) led to impressive advances in the elucidation of genetic factors underlying complex phenotypes and diseases. However, the ability of GWAS to identify new susceptibility loci in a hypothesis-free approach requires tools to quickly retrieve comprehensive information about a genomic region and analyze the potential effects of coding and non-coding SNPs in a candidate gene region. Furthermore, once a candidate region is chosen for resequencing and fine-mapping studies, the identification of several rare mutations is likely and requires strong bioinformatic support to properly evaluate and prioritize the found mutations for further analysis. Due to the variety of regulatory layers that can be affected by a mutation, a comprehensive in-silico evaluation of candidate SNPs can be a demanding and very time-consuming task. Although many bioinformatic tools that significantly simplify this task were made available in the last years, their utility is often still unknown to researches not intensively involved in bioinformatics. We present a comprehensive guide of 64 tools and databases to bioinformatically analyze gene regions of interest to predict SNP effects. In addition, we discuss tools to perform data mining of large genetic regions, predict the presence of regulatory elements, make in-silico evaluations of SNPs effects and address issues ranging from interactome analysis to graphically annotated proteins sequences. Finally, we exemplify the use of these tools by applying them to hits of a recently performed GWAS. Taken together a combination of the discussed tools are summarized and constantly updated in the web-based "GenEpi Toolbox" (http://genepi_toolbox.i-med.ac.at) and can help to get a glimpse at the potential functional relevance of both large genetic regions and single nucleotide mutations which might help to prioritize the next steps.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Computational Biology*
  • Data Mining / methods
  • Databases, Genetic
  • Decision Trees
  • Gene Expression Regulation
  • Genome, Human*
  • Genome-Wide Association Study*
  • Humans
  • MicroRNAs / physiology
  • Molecular Epidemiology / methods*
  • Polymorphism, Single Nucleotide*
  • Software

Substances

  • MicroRNAs