Temperate phages (prophages) are ubiquitous in nature and persist as dormant components of host cells (lysogenic stage) before activating and lysing the host (lytic stage). Actively replicating prophages contribute to central community processes, such as enabling bacterial virulence, manipulating biogeochemical cycling, and driving microbial community diversification. Recent advances in sequencing technology have allowed for the identification and characterization of diverse phages, yet no approaches currently exist for determining whether a prophage has activated. Here, we present PropagAtE (Prophage Activity Estimator), an automated software tool for estimating whether a prophage is in the lytic or lysogenic stage of infection. PropagAtE uses statistical analyses of prophage-to-host read coverage ratios to identify actively replicating prophages, irrespective of whether prophages were induced or spontaneously activated. We demonstrate that PropagAtE is fast, accurate, and sensitive, regardless of sequencing depth. Application of PropagAtE to prophages from 348 complex metagenomes from human gut, murine gut, and soil environments identified distinct spatial and temporal prophage activation signatures, with the highest proportion of active prophages in murine gut samples. In antibiotic-treated and untreated infants, we identified active prophage populations that correlated with specific treatment groups. Within time series samples from the human gut, 11 prophage populations, some encoding the sulfur metabolism gene cysH or a rhuM-like virulence factor, were consistently present over time but not active. Overall, PropagAtE will facilitate accurate representations of viruses in microbiomes by associating prophages with their active roles in shaping microbial communities in nature.
IMPORTANCE Viruses that infect bacteria are key components of microbiomes and ecosystems. They can kill and manipulate microorganisms, drive planetary-scale processes and biogeochemical cycling, and influence the structures of entire food networks. Prophages are viruses that can exist in a dormant state within the genome of their host (lysogenic stage) before activating to replicate and kill the host (lytic stage). Recent advances have allowed for the identification of diverse viruses in nature, but no approaches exist for characterizing prophages and their stages of infection (prophage activity). We develop and benchmark an automated approach, PropagAtE, to identify the stages of infection of prophages from genomic data. We provide evidence that active prophages vary in identity and abundance across multiple environments and scales. Our approach will enable accurate and unbiased analyses of viruses in microbiomes and ecosystems.
Keywords: metagenome; microbiome; prophage; software; virus.
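The coverage-ratio idea described in the abstract can be illustrated with a minimal sketch. This is not PropagAtE's implementation: the function names, the choice of Cohen's d as the accompanying statistic, and the thresholds `min_ratio` and `min_effect` are assumptions made for illustration only. The sketch assumes per-base read depths are already available for the prophage region and for the flanking host region of the same scaffold.

```python
from math import sqrt
from statistics import mean, pvariance


def cohens_d(a, b):
    """Effect size between two depth samples, using a pooled standard deviation."""
    pooled = sqrt((pvariance(a) + pvariance(b)) / 2)
    diff = mean(a) - mean(b)
    if pooled == 0:
        # No variation in either sample: the effect is zero or unbounded.
        return 0.0 if diff == 0 else float("inf") * (1 if diff > 0 else -1)
    return diff / pooled


def classify_prophage(prophage_cov, host_cov, min_ratio=2.0, min_effect=0.7):
    """Flag a prophage as active when its coverage clearly exceeds host coverage.

    min_ratio and min_effect are illustrative placeholders (assumptions),
    not values taken from the text above.
    """
    host_mean = mean(host_cov)
    if host_mean == 0:
        # Without host coverage the ratio is undefined; make no call.
        return {"ratio": None, "effect_size": None, "active": False}
    ratio = mean(prophage_cov) / host_mean
    effect = cohens_d(prophage_cov, host_cov)
    active = ratio >= min_ratio and effect >= min_effect
    return {"ratio": ratio, "effect_size": effect, "active": active}


# Toy per-base depths (made-up numbers for demonstration only).
host = [10, 9, 11, 10]
dormant = classify_prophage([10, 11, 9, 10], host)
replicating = classify_prophage([40, 42, 38, 40], host)
print(dormant["active"], replicating["active"])
```

The intuition is that a dormant prophage is replicated once per host genome copy, so its coverage tracks the host's (ratio near 1), whereas an actively replicating prophage produces extra genome copies and its coverage rises above that of the flanking host sequence. Pairing the ratio with an effect size guards against calling a prophage active when the difference is small relative to the variation in depth.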