UROPA: a tool for Universal RObust Peak Annotation

Sci Rep. 2017 Jun 1;7(1):2593. doi: 10.1038/s41598-017-02464-y.

Abstract

The annotation of genomic ranges of interest is a recurring task in bioinformatics analyses. These ranges can originate from various sources, including peaks called for transcription factor binding sites (TFBS) or histone modification ChIP-seq experiments, chromatin structure and accessibility experiments (such as ATAC-seq), and other types of predictions that result in genomic ranges. Although peak annotation, driven primarily by ChIP-seq, has been extensively explored, many approaches remain simplistic ("most closely located TSS"), rely on fixed pre-built references, or require complex scripting by the user. An adaptable, fast, and universal tool capable of annotating genomic ranges in their respective biological context is critically missing. UROPA (Universal RObust Peak Annotator) is a command-line tool intended for universal genomic range annotation. Based on a configuration file, different target features can be prioritized with multiple integrated queries. These queries can be sensitive to feature type, distance, strand specificity, feature attributes (e.g. protein_coding), or anchor position relative to the feature. UROPA can incorporate reference annotation files (GTF) from different sources (Gencode, Ensembl, RefSeq), as well as custom reference annotation files. Statistics and plots transparently summarize the annotation process. UROPA is implemented in Python and R.
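The idea of prioritized queries described in the abstract can be illustrated with a minimal Python sketch. This is not UROPA's code, API, or configuration format: the names `Feature`, `Query`, `anchor_position`, and `annotate`, the use of the peak center as the reference point, and the example coordinates are all assumptions made for illustration. Each query filters candidate features by type, strand, and an attribute value, measures the distance from the peak center to a chosen feature anchor, and the first query in the list that yields a match determines the annotation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Feature:
    """One reference annotation record (hypothetical, GTF-like fields)."""
    chrom: str
    start: int
    end: int
    strand: str
    feature_type: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Query:
    """One annotation query (hypothetical structure, not UROPA's config keys)."""
    feature_type: str
    max_distance: int
    anchor: str = "start"                      # "start", "center", or "end"
    strand: Optional[str] = None               # None accepts either strand
    attribute_filter: Optional[tuple] = None   # (key, required value)


def anchor_position(feature: Feature, anchor: str) -> int:
    """Return the strand-aware anchor coordinate of a feature."""
    if anchor == "center":
        return (feature.start + feature.end) // 2
    on_plus = feature.strand != "-"
    if anchor == "start":
        return feature.start if on_plus else feature.end
    if anchor == "end":
        return feature.end if on_plus else feature.start
    raise ValueError(f"unknown anchor: {anchor}")


def annotate(chrom, peak_start, peak_end, features, queries):
    """Return (feature, distance, query_index) for the first query that matches.

    Queries are tried in list order, so earlier queries have higher priority.
    Returns None if no query finds a feature within its distance limit.
    """
    peak_center = (peak_start + peak_end) // 2
    for index, query in enumerate(queries):
        best = None
        for feat in features:
            if feat.chrom != chrom or feat.feature_type != query.feature_type:
                continue
            if query.strand is not None and feat.strand != query.strand:
                continue
            if query.attribute_filter is not None:
                key, value = query.attribute_filter
                if feat.attributes.get(key) != value:
                    continue
            distance = abs(peak_center - anchor_position(feat, query.anchor))
            if distance <= query.max_distance and (best is None or distance < best[1]):
                best = (feat, distance)
        if best is not None:
            return best[0], best[1], index
    return None


# Illustrative data only: two made-up genes on one chromosome.
features = [
    Feature("chr1", 1000, 2000, "+", "gene", {"gene_type": "lncRNA", "gene_name": "A"}),
    Feature("chr1", 3000, 5000, "-", "gene", {"gene_type": "protein_coding", "gene_name": "B"}),
]
queries = [
    # Preferred: a protein-coding gene whose start lies within 500 bp.
    Query("gene", 500, anchor="start", attribute_filter=("gene_type", "protein_coding")),
    # Fallback: any gene whose start lies within 1500 bp.
    Query("gene", 1500, anchor="start"),
]

near_b = annotate("chr1", 4900, 5100, features, queries)
near_a = annotate("chr1", 2100, 2300, features, queries)
far_away = annotate("chr1", 90000, 90200, features, queries)
```

In this sketch the peak near the start of gene B is resolved by the first, stricter query, the peak between the two genes falls through to the second query, and the distant peak stays unannotated. A real analysis would read features from a GTF file and would likely use an interval index rather than a linear scan.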

MeSH terms

  • Animals
  • Computational Biology*
  • Genome
  • Genomics*
  • Humans
  • Molecular Sequence Annotation*
  • Software*