GenomeFLTR: filtering reads made easy

Edo Dotan; Michael Alburquerque; Elya Wygoda; Dorothée Huchon; Tal Pupko

doi:10.1093/nar/gkad410

Nucleic Acids Res. 2023 Jul 5;51(W1):W232-W236. doi: 10.1093/nar/gkad410.

Authors

Edo Dotan¹, Michael Alburquerque¹, Elya Wygoda¹, Dorothée Huchon²,³, Tal Pupko¹

Affiliations

¹ The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel.
² School of Zoology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel.
³ The Steinhardt Museum of Natural History, Israel National Center for Biodiversity Studies, Tel-Aviv University, Tel Aviv 69978, Israel.

Abstract

In the last decade, advances in sequencing technology have led to an exponential increase in genomic data. These new data have dramatically changed our understanding of the evolution and function of genes and genomes. Despite improvements in sequencing technologies, identifying contaminated reads remains a complex task for many research groups. Here, we introduce GenomeFLTR, a new web server to filter contaminated reads. Reads are compared against existing sequence databases from various representative organisms to detect potential contaminants. The main features implemented in GenomeFLTR are: (i) automated updating of the relevant databases; (ii) fast comparison of each read against the database; (iii) the ability to create user-specified databases; (iv) a user-friendly interactive dashboard to investigate the origin and frequency of the contaminations; (v) the generation of a contamination-free file. Availability: https://genomefltr.tau.ac.il/.
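The idea of comparing each read against a database of potential contaminants and writing out a contamination-free set can be illustrated with a minimal sketch. It assumes a simple exact k-mer lookup, which is not necessarily how GenomeFLTR classifies reads; the function names, the k-mer size and the threshold are all hypothetical choices made for illustration, and reverse-complement matches are ignored for brevity.

```python
from typing import Iterable, Iterator, List, Set, Tuple

Read = Tuple[str, str]  # (read name, nucleotide sequence)


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping k-mer of seq, upper-cased."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references: Iterable[str], k: int) -> Set[str]:
    """Collect all k-mers of the contaminant reference sequences."""
    index: Set[str] = set()
    for ref in references:
        index.update(kmers(ref, k))
    return index


def contaminant_fraction(read: str, index: Set[str], k: int) -> float:
    """Fraction of the read's k-mers that occur in the contaminant index."""
    total = 0
    hits = 0
    for kmer in kmers(read, k):
        total += 1
        if kmer in index:
            hits += 1
    return hits / total if total else 0.0


def filter_reads(
    reads: Iterable[Read], index: Set[str], k: int, threshold: float
) -> Tuple[List[Read], List[Read]]:
    """Split reads into (clean, flagged) using a hypothetical cut-off."""
    clean: List[Read] = []
    flagged: List[Read] = []
    for name, seq in reads:
        if contaminant_fraction(seq, index, k) >= threshold:
            flagged.append((name, seq))
        else:
            clean.append((name, seq))
    return clean, flagged


if __name__ == "__main__":
    # Toy data, invented for this example only.
    contaminant = "ACGTACGTTAGCTAGCTAGGATCCGATCGATCG"
    index = build_index([contaminant], k=8)
    reads = [
        ("r1", "TAGCTAGCTAGGATCC"),       # taken from the contaminant
        ("r2", "GGGGGGGGCCCCCCCCAAAA"),   # unrelated sequence
    ]
    clean, flagged = filter_reads(reads, index, k=8, threshold=0.5)
    print("kept:", [name for name, _ in clean])
    print("flagged:", [name for name, _ in flagged])
```

In this sketch the threshold plays the role of a user-adjustable stringency setting: lowering it flags more reads as contaminants, raising it keeps more reads in the filtered output.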

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Databases, Nucleic Acid
Genome / genetics
Genomics*
High-Throughput Nucleotide Sequencing*
Sequence Analysis, DNA
Software