StarPhase: Comprehensive Phase-Aware Pharmacogenomic Diplotyper for Long-Read Sequencing Data

bioRxiv [Preprint]. 2024 Dec 11:2024.12.10.627527. doi: 10.1101/2024.12.10.627527.

Abstract

Pharmacogenomics is central to precision medicine, informing medication safety and efficacy. Pharmacogenomic diplotyping of complex genes requires full-length DNA sequences and detection of structural rearrangements. We introduce StarPhase, a tool that leverages PacBio HiFi sequence data to diplotype 21 CPIC Level A pharmacogenes and provides detailed haplotypes and supporting visualizations for HLA-A, HLA-B, and CYP2D6. StarPhase diplotypes are highly concordant with benchmarks: 99.5% are either exact matches or minor discrepancies. Manual inspection of the remaining 0.5%, the mismatches, indicates that StarPhase called them correctly. With StarPhase, we update or correct 26.2% of GeT-RM pharmacogenomic diplotypes. Population distributions from StarPhase mostly reflect those of the All of Us cohort, while also highlighting gaps in existing pharmacogenomic databases that long-read sequencing can fill. With a single HiFi whole genome sequencing assay, StarPhase enables robust pharmacogenomic (PGx) diplotyping even as additional pharmacogenes and haplotypes are discovered.

Publication types

  • Preprint