Semiconductor sequencing: how many flows do you need?

Bioinformatics. 2015 Apr 15;31(8):1199-203. doi: 10.1093/bioinformatics/btu805. Epub 2014 Dec 4.

Abstract

Motivation: Semiconductor sequencing directly translates chemically encoded information (A, C, G or T) into voltage signals that are detected by a semiconductor device. During strand synthesis, nucleotides are supplied in cyclically repeated flows, one nucleotide type per flow, and the resulting changes in pH, and thereby in the electric potential of the reaction well, are detected. To minimize run time and cost, it is necessary to know the number of flows required for complete coverage of the templates.
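
To make the notion of "flows required" concrete, the following is a minimal sketch of counting flows for a single template. It is an illustration under stated assumptions, not the authors' implementation: the function name flows_required is hypothetical, the default flow order "TACG" is an assumed simple cyclic order (instruments may use other orders), the input is taken to be the strand as it is synthesized, and each flow is assumed to incorporate a complete homopolymer run of the flowed nucleotide.

```python
def flows_required(template: str, flow_order: str = "TACG") -> int:
    """Count the flows needed to synthesize `template` completely.

    Assumptions (not taken from the abstract):
      * `template` is the strand as synthesized, 5' to 3'.
      * `flow_order` repeats cyclically, one nucleotide type per flow.
      * One flow incorporates a whole homopolymer run of its nucleotide.
    """
    template = template.upper()
    unknown = set(template) - set(flow_order)
    if unknown:
        # A base that is never flowed could never be incorporated.
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")

    flows = 0  # number of flows performed so far
    pos = 0    # number of template bases already incorporated
    while pos < len(template):
        nucleotide = flow_order[flows % len(flow_order)]
        flows += 1
        # Incorporate every consecutive base that matches this flow.
        while pos < len(template) and template[pos] == nucleotide:
            pos += 1
    return flows


if __name__ == "__main__":
    for seq in ["TACG", "TTAACCGG", "GCAT"]:
        print(seq, flows_required(seq))
```

In this sketch, a template that follows the flow order is completed within one cycle, whereas a template that runs against it needs almost a full cycle per base, which is why the required number of flows depends on the sequence and not only on its length.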

Results: We calculate the number of required flows in a random sequence model and present exact expressions for the cumulative distribution function, expected value and variance. Additionally, we provide an algorithm to calculate the number of required flows for a concrete list of amplicons using a BED file of genomic positions as input. We apply the algorithm to calculate the number of flows that are required to cover six amplicon panels that are used for targeted sequencing in cancer research. The upper bounds obtained for the number of flows make it possible to increase instrument throughput from two chips to three chips per day for four of these panels.
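
The idea of an upper bound for a panel can be sketched as follows: compute the flows needed for each amplicon and take the maximum. This is a hedged illustration, not the ionflows algorithm. It assumes that the amplicon sequences have already been extracted from the reference genome (the BED-to-sequence step is omitted), that flows follow a simple cyclic "TACG" order, and that one flow incorporates a whole homopolymer run. The names flows_required and panel_flows and the toy sequences are hypothetical.

```python
from typing import Dict


def flows_required(template: str, flow_order: str = "TACG") -> int:
    """Flows needed to synthesize `template` under a cyclic flow order.

    Assumes one flow incorporates a whole homopolymer run.
    """
    template = template.upper()
    unknown = set(template) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")
    flows = 0
    pos = 0
    while pos < len(template):
        nucleotide = flow_order[flows % len(flow_order)]
        flows += 1
        while pos < len(template) and template[pos] == nucleotide:
            pos += 1
    return flows


def panel_flows(amplicons: Dict[str, str],
                flow_order: str = "TACG") -> Dict[str, int]:
    """Map each amplicon name to the flows needed to cover it completely."""
    return {name: flows_required(seq, flow_order)
            for name, seq in amplicons.items()}


# Toy sequences for illustration only; these are not real panel amplicons.
amplicons = {
    "amp1": "TACGTACG",
    "amp2": "GCAT",
    "amp3": "TTTTAAAA",
}

per_amplicon = panel_flows(amplicons)
# The panel is fully covered once its most demanding amplicon is covered.
upper_bound = max(per_amplicon.values())

if __name__ == "__main__":
    print(per_amplicon)
    print("flows needed for the whole panel:", upper_bound)
```

Under these assumptions, a run can stop after upper_bound flows without truncating any amplicon in the panel.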

Availability and implementation: The algorithm for calculating the number of flows was implemented in R and is available as the package ionflows from the CRAN repository.

Contact: jan.budczies@charite.de

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Genome, Human*
  • Genomics / methods*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Neoplasms / genetics*
  • Semiconductors
  • Sequence Analysis, DNA / methods*
  • Software*