The site frequency spectrum (SFS) is a popular summary statistic of genomic data. While the SFS of a constant-sized population undergoing neutral mutations has been extensively studied in population genetics, the rapidly growing amount of cancer genomic data has attracted interest in the spectrum of an exponentially growing population. Recent theoretical results have generally dealt with special or limiting cases, such as considering only cells with an infinite line of descent, assuming deterministic tumor growth, or taking large-time or large-population limits. In this work, we derive exact expressions for the expected SFS of a cell population that evolves according to a stochastic branching process, first for cells with an infinite line of descent and then for the total population, evaluated either at a fixed time (fixed-time spectrum) or at the stochastic time at which the population reaches a certain size (fixed-size spectrum). We find that while the rate of mutation scales the SFS of the total population linearly, the rates of cell birth and cell death change the shape of the spectrum at the small-frequency end, inducing a transition between a 1/j^2 power-law spectrum and a 1/j spectrum as cell viability decreases. We show that this insight can in principle be used to estimate the ratio between the rate of cell death and cell birth, as well as the mutation rate, using the site frequency spectrum alone. Although the discussion is framed in terms of tumor dynamics, our results apply to any exponentially growing population of individuals undergoing neutral mutations.
Keywords: Branching processes; Cancer evolution; Exponentially growing populations; Infinite sites model; Mathematical modeling; Site frequency spectrum.
Copyright © 2021 Elsevier Inc. All rights reserved.
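The fixed-size spectrum described in the abstract can be explored numerically. The following is a minimal simulation sketch, not the exact expressions derived in the paper. It assumes a birth-death branching process started from one unmutated cell, a Poisson-distributed number of new neutral mutations in each daughter cell at division (an assumed mutation model), infinite sites (every mutation gets a fresh identifier), and restarts whenever the population goes extinct. The names `poisson`, `simulate_fixed_size_sfs`, and `mut_mean` are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def poisson(rng, lam):
    """Draw a Poisson(lam) variate by Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_fixed_size_sfs(birth, death, mut_mean, target_size, rng):
    """Simulate one tumor up to target_size cells and return {j: number of mutations in exactly j cells}.

    Assumptions (not taken from the paper): every cell divides at rate `birth`
    and dies at rate `death`, so the next event hits a uniformly chosen cell and
    is a division with probability birth / (birth + death). Each daughter gets
    Poisson(mut_mean) new mutations. Event times are not tracked because the
    spectrum is read off when the population first reaches target_size.
    """
    p_divide = birth / (birth + death)
    while True:  # restart until a run survives to target_size
        cells = [()]  # each cell is the tuple of mutation ids it carries
        next_id = 0
        while 0 < len(cells) < target_size:
            # Pick a uniformly random cell and remove it in O(1) by swapping to the end.
            i = rng.randrange(len(cells))
            cells[i], cells[-1] = cells[-1], cells[i]
            parent = cells.pop()
            if rng.random() < p_divide:
                # Division: two daughters inherit the parent's mutations plus new ones.
                for _ in range(2):
                    k = poisson(rng, mut_mean)
                    cells.append(parent + tuple(range(next_id, next_id + k)))
                    next_id += k
            # Otherwise the chosen cell dies and is simply dropped.
        if cells:
            break
    # Count carriers per mutation, then count mutations per carrier number j.
    carriers = Counter(m for cell in cells for m in cell)
    return dict(Counter(carriers.values()))


if __name__ == "__main__":
    rng = random.Random(2021)
    sfs = simulate_fixed_size_sfs(birth=1.0, death=0.5, mut_mean=1.0, target_size=200, rng=rng)
    for j in sorted(sfs)[:10]:
        print(j, sfs[j])
```

A single run is noisy. Averaging the returned counts over many independent runs gives an estimate of the expected spectrum, and repeating this for several death-to-birth ratios is one way to look at how the small-frequency end changes shape as cell viability decreases.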