The rates and routes of lethal systemic spread in breast cancer are poorly understood owing to a lack of molecularly characterized patient cohorts with long-term, detailed follow-up data. Long-term follow-up is especially important for those with oestrogen-receptor (ER)-positive breast cancers, which can recur up to two decades after initial diagnosis [1-6]. It is therefore essential to identify patients who have a high risk of late relapse [7-9]. Here we present a statistical framework that models distinct disease stages (locoregional recurrence, distant recurrence, breast-cancer-related death and death from other causes) and competing risks of mortality from breast cancer, while yielding individual risk-of-recurrence predictions. We apply this model to 3,240 patients with breast cancer, including 1,980 for whom molecular data are available, and delineate spatiotemporal patterns of relapse across different categories of molecular information (namely immunohistochemical subtypes; PAM50 subtypes, which are based on gene-expression patterns [10,11]; and integrative or IntClust subtypes, which are based on patterns of genomic copy-number alterations and gene expression [12,13]). We identify four late-recurring integrative subtypes, comprising about one quarter (26%) of tumours that are both positive for ER and negative for human epidermal growth factor receptor 2, each with characteristic tumour-driving alterations in genomic copy number and a high risk of recurrence (mean 47-62%) up to 20 years after diagnosis. We also define a subgroup of triple-negative breast cancers in which cancer rarely recurs after five years, and a separate subgroup in which patients remain at risk. Use of the integrative subtypes improves the prediction of late, distant relapse beyond what is possible with clinical covariates (nodal status, tumour size, tumour grade and immunohistochemical subtype).
These findings highlight opportunities for improved patient stratification and biomarker-driven clinical trials.
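
To make the idea of modelling disease stages with competing causes of death more concrete, the following is a minimal sketch, not the framework used in the study. It assumes a deliberately simplified discrete-time, time-homogeneous Markov chain over the five states named above. The state names, the `state_probabilities` helper and every transition probability are hypothetical and chosen only for illustration; none of them are estimates from the cohort.

```python
# Disease states named in the abstract; the two death states are absorbing
# and act as competing risks for one another.
STATES = ["disease_free", "locoregional", "distant", "cancer_death", "other_death"]

# Hypothetical annual transition probabilities (illustrative only, NOT estimates
# from the study). Each row sums to 1.
P = {
    "disease_free": {"disease_free": 0.94, "locoregional": 0.02,
                     "distant": 0.02, "other_death": 0.02},
    "locoregional": {"locoregional": 0.83, "distant": 0.10,
                     "cancer_death": 0.05, "other_death": 0.02},
    "distant":      {"distant": 0.68, "cancer_death": 0.30, "other_death": 0.02},
    "cancer_death": {"cancer_death": 1.0},
    "other_death":  {"other_death": 1.0},
}


def state_probabilities(start, years):
    """Return the probability of occupying each state `years` after starting in `start`."""
    dist = {s: 0.0 for s in STATES}
    dist[start] = 1.0
    for _ in range(years):
        nxt = {s: 0.0 for s in STATES}
        # Push the probability mass in each state along its outgoing transitions.
        for src, mass in dist.items():
            for dst, p in P[src].items():
                nxt[dst] += mass * p
        dist = nxt
    return dist


if __name__ == "__main__":
    for horizon in (5, 10, 20):
        d = state_probabilities("disease_free", horizon)
        print(horizon, {s: round(p, 3) for s, p in d.items()})
```

Because death from other causes removes probability mass that could otherwise reach the breast-cancer-death state, the cumulative probability of cancer death in this toy chain is lower than it would be if other-cause mortality were ignored, which is the basic reason for treating the two as competing risks. A more realistic model would let the transition rates depend on time since diagnosis or relapse and on patient covariates such as molecular subtype.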