PAMPHLET: PAM prediction HomoLogous-Enhancement toolkit for precise PAM prediction in CRISPR-Cas systems

J Genet Genomics. 2024 Nov 8:S1673-8527(24)00288-1. doi: 10.1016/j.jgg.2024.10.014. Online ahead of print.

Abstract

CRISPR-Cas technology has revolutionized our ability to understand and engineer organisms, evolving from a single Cas9 model to a diverse CRISPR toolbox. A critical bottleneck in developing new Cas proteins is identifying protospacer adjacent motif (PAM) sequences. Due to the limitations of experimental methods, bioinformatics approaches have become essential. However, existing PAM prediction programs are limited by the small number of spacers in CRISPR-Cas systems, resulting in low accuracy. To address this, we develop PAMPHLET, a novel pipeline that uses homology searches to identify additional spacers, increasing the number of spacers by up to 18-fold. PAMPHLET is validated on 20 CRISPR-Cas systems and successfully predicts PAM sequences for 18 protospacers. These predictions are further validated using the DocMF platform, which characterizes protein-DNA recognition patterns via next-generation sequencing. The high consistency between PAMPHLET predictions and DocMF results for novel Cas proteins demonstrates the potential of PAMPHLET to enhance PAM sequence prediction accuracy, expedite the discovery process, and accelerate the development of CRISPR tools.

Keywords: CRISPR-Cas; Computational framework; Genome editing; PAM prediction; Protospacer adjacent motif.
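
Illustrative note: the abstract attributes the low accuracy of existing predictors to the small number of spacers per system, and describes pooling extra spacers found by homology search. The sketch below is not PAMPHLET's implementation; it only illustrates, under stated assumptions, why more spacers can sharpen a PAM call. It assumes the sequences flanking each protospacer hit have already been extracted and aligned to equal length. The function name `consensus_pam`, the `min_fraction` threshold, and the toy sequences are all hypothetical and invented for this example.

```python
from collections import Counter


def consensus_pam(flanks, min_fraction=0.7):
    """Return a per-position consensus of equal-length flanking sequences.

    A position is reported as its most common base only if that base
    reaches `min_fraction` of all sequences; otherwise it is reported
    as 'N'. Hypothetical helper, not part of PAMPHLET.
    """
    if not flanks:
        raise ValueError("at least one flanking sequence is required")
    length = len(flanks[0])
    if any(len(f) != length for f in flanks):
        raise ValueError("all flanking sequences must have the same length")

    consensus = []
    for position in range(length):
        counts = Counter(f[position].upper() for f in flanks)
        base, count = counts.most_common(1)[0]
        consensus.append(base if count / len(flanks) >= min_fraction else "N")
    return "".join(consensus)


# Toy data (made up): flanks from the system's own spacers, and
# flanks recovered via spacers from hypothetical homologous systems.
own_flanks = ["AGG", "TGA"]
homolog_flanks = ["CGG", "GGG", "AGG", "TGG", "CGG", "AGG"]

few = consensus_pam(own_flanks)
pooled = consensus_pam(own_flanks + homolog_flanks)
print(few, pooled)
```

With only two flanks the third position is split evenly and stays ambiguous (`NGN`); after pooling, the same threshold resolves it (`NGG`). A real pipeline would also need to handle protospacer search, strand orientation, and redundancy filtering, none of which is shown here.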