BOOGIE: Predicting Blood Groups from High Throughput Sequencing Data

PLoS One. 2015 Apr 20;10(4):e0124579. doi: 10.1371/journal.pone.0124579. eCollection 2015.

Abstract

Over the last decade, we have witnessed an incredible growth in the amount of available genotype data due to high throughput sequencing (HTS) techniques. This information may be used to predict phenotypes of medical relevance and pave the way towards personalized medicine. Blood phenotypes (e.g. ABO and Rh) are purely genetic traits that have been extensively studied for decades, with over thirty blood groups currently known. Given the public availability of blood group data, it is of interest to predict these phenotypes from HTS data, which may translate into more accurate blood typing in clinical practice. Here we propose BOOGIE, a fast predictor for the inference of blood groups from single nucleotide variant (SNV) databases. We focus on the prediction of thirty blood groups, ranging from the well-known ABO and Rh to the less studied Junior or Diego. BOOGIE correctly predicted the blood group with 94% accuracy for the Personal Genome Project whole genome profiles where good quality SNV annotation was available. Additionally, our tool produces a high-quality haplotype phase, which is of interest in the context of ethnicity-specific polymorphisms or traits. The versatility and simplicity of the analysis make it easily interpretable and allow easy extension of the protocol towards other phenotypes. BOOGIE can be downloaded from http://protein.bio.unipd.it/download/.
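The abstract does not describe the prediction procedure in detail, so the following is only a minimal sketch of how a phenotype could in principle be inferred from a set of observed SNVs, assuming a reference table that links each phenotype to its defining variants. The table contents, the variant identifiers (`snv_a`, `snv_b`, `snv_c`), the phenotype names, and the function `predict_phenotype` are all hypothetical placeholders, not BOOGIE's actual data or API.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical reference table: phenotype name -> set of defining variants.
# Real tables would hold curated blood group alleles; these are placeholders.
REFERENCE: Dict[str, FrozenSet[str]] = {
    "type_1": frozenset(),
    "type_2": frozenset({"snv_a"}),
    "type_3": frozenset({"snv_a", "snv_b"}),
}


def predict_phenotype(
    observed: FrozenSet[str],
    reference: Dict[str, FrozenSet[str]] = REFERENCE,
) -> Tuple[str, int]:
    """Return the closest reference phenotype and its mismatch count.

    Distance is the size of the symmetric difference between the observed
    variants and the variants that define a phenotype. Ties are broken
    alphabetically by phenotype name, an arbitrary choice for this sketch.
    """
    best = min(sorted(reference), key=lambda name: len(reference[name] ^ observed))
    return best, len(reference[best] ^ observed)


if __name__ == "__main__":
    # An exact match has distance 0; an unexpected extra variant adds 1.
    print(predict_phenotype(frozenset({"snv_a"})))
    print(predict_phenotype(frozenset({"snv_a", "snv_b", "snv_c"})))
```

A nonzero mismatch count flags a sample whose variants do not exactly match any known entry, which is one simple way such a prediction could remain interpretable.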

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • ABO Blood-Group System / genetics
  • Blood Group Antigens / genetics*
  • Exons / genetics
  • Genome, Human
  • Haplotypes / genetics
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Molecular Sequence Annotation
  • Mutation / genetics
  • Phenotype
  • Polymorphism, Single Nucleotide / genetics
  • Software*

Substances

  • ABO Blood-Group System
  • Blood Group Antigens

Grants and funding

GM is an AIRC research fellow. This work was also supported by the Italian Ministry of Health grant GR-2011-02346845 and the FIRB Futuro in Ricerca grant RBFR08ZSXY to ST. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.