Predicting bacteriophage infection of specific bacterial strains promises advancements in phage therapy and microbial ecology. Whether the dynamics of well-established phage-host model systems generalize to the wide diversity of microbes is currently unknown. Here we show that the outcomes of phage-bacteria interactions can be accurately predicted at the strain level in natural isolates from the genus Escherichia using only genomic data (area under the receiver operating characteristic curve (AUROC) of 86%). We experimentally established a dataset of interactions between 403 diverse Escherichia strains and 96 phages. Most interactions are explained by adsorption factors rather than by antiphage systems, which play a marginal role. We trained predictive algorithms and pinpointed poorly predicted interactions to direct future research efforts. Finally, we established a pipeline to recommend tailored phage cocktails and demonstrated its efficiency on 100 pathogenic E. coli isolates. This work provides quantitative insights into phage-host specificity and supports the use of predictive algorithms in phage therapy.
© 2024. The Author(s), under exclusive licence to Springer Nature Limited.