Predicting the effects of SNPs on transcription factor binding affinity

Bioinformatics. 2020 Jan 15;36(2):364-372. doi: 10.1093/bioinformatics/btz612.

Abstract

Motivation: Genome-wide association studies have revealed that 88% of disease-associated single-nucleotide polymorphisms (SNPs) reside in noncoding regions. However, noncoding SNPs remain understudied, partly because they are challenging to prioritize for experimental validation. To address this deficiency, we developed the SNP effect matrix pipeline (SEMpl).

Results: SEMpl estimates transcription factor-binding affinity by observing differences in chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) signal intensity for SNPs within functional transcription factor-binding sites (TFBSs) genome-wide. By cataloging the effects of every possible mutation within the TFBS motif, SEMpl can predict the consequences of SNPs for transcription factor binding. This knowledge can be used to identify potential disease-causing regulatory loci.
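
The idea of cataloging the effect of every possible mutation within a motif can be illustrated with a toy sketch. This is not SEMpl's implementation (see the repository listed under Availability for that); the function names, the log2-ratio scoring against the mean signal of exact consensus matches, and the example data below are all assumptions made for illustration only.

```python
from collections import defaultdict
from math import log2

NUCLEOTIDES = "ACGT"


def build_effect_matrix(consensus, sites):
    """Toy SNP effect matrix (hypothetical helper, not part of SEMpl).

    consensus: motif consensus string, e.g. "ACG".
    sites: iterable of (sequence, signal) pairs, where signal is a
           positive ChIP-seq-like intensity for that site.
    Returns a dict mapping (position, nucleotide) to the log2 ratio of
    the mean signal for that single-base variant over the mean signal
    of exact consensus matches, or None when no such site was observed.
    """
    baseline = []                    # signals at exact consensus matches
    by_variant = defaultdict(list)   # signals at single-mismatch sites
    for seq, signal in sites:
        if len(seq) != len(consensus):
            continue  # ignore sites that do not span the whole motif
        mismatches = [(i, b) for i, (a, b) in enumerate(zip(consensus, seq)) if a != b]
        if not mismatches:
            baseline.append(signal)
        elif len(mismatches) == 1:
            by_variant[mismatches[0]].append(signal)
        # sites with two or more mismatches are skipped in this sketch

    if not baseline:
        raise ValueError("no exact consensus matches to use as a baseline")
    base_mean = sum(baseline) / len(baseline)

    matrix = {}
    for pos, ref in enumerate(consensus):
        for nt in NUCLEOTIDES:
            if nt == ref:
                matrix[(pos, nt)] = 0.0  # consensus base is the reference point
            elif (pos, nt) in by_variant:
                signals = by_variant[(pos, nt)]
                matrix[(pos, nt)] = log2((sum(signals) / len(signals)) / base_mean)
            else:
                matrix[(pos, nt)] = None  # variant never observed
    return matrix


def snp_effect(matrix, pos, ref, alt):
    """Predicted change in (log2) signal when ref is replaced by alt at pos."""
    ref_score = matrix[(pos, ref)]
    alt_score = matrix[(pos, alt)]
    if ref_score is None or alt_score is None:
        return None
    return alt_score - ref_score


# Made-up example data: (site sequence, signal intensity)
toy_sites = [("ACG", 8.0), ("ACG", 8.0), ("TCG", 2.0), ("AAG", 4.0)]
toy_matrix = build_effect_matrix("ACG", toy_sites)
print(snp_effect(toy_matrix, 0, "A", "T"))  # -2.0
```

In this sketch a negative score means the variant is associated with weaker signal than the consensus, and unobserved variants are left as None rather than guessed.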

Availability and implementation: SEMpl is available from https://github.com/Boyle-Lab/SEM_CPP.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites
  • Chromatin Immunoprecipitation
  • Genome-Wide Association Study*
  • Polymorphism, Single Nucleotide*
  • Protein Binding
  • Transcription Factors

Substances

  • Transcription Factors