Differences in recording of cancer diagnosis between datasets in England: A population-based study of linked cancer registration, hospital, and primary care data

Cancer Epidemiol. 2024 Nov 28:94:102703. doi: 10.1016/j.canep.2024.102703. Online ahead of print.

Abstract

Background: Differences in the recording of cancer case status and diagnosis date have been observed between the cancer registry (CR) - the reference standard - and electronic health records (EHRs); such differences may affect estimates of cancer risk or lead to misclassification of diagnostic pathways. This study aims to quantify differences in the recording of case status and date of cancer diagnosis between the cancer registry and EHRs.

Methods: Linked primary care (Clinical Practice Research Datalink (CPRD)), secondary care (Hospital Episode Statistics (HES)), and national cancer registry (CR) data were used to identify 14,301 patients with a recorded diagnosis of brain, colon, lung, ovarian, or pancreatic cancer between 1999 and 2018. Agreement in case status between datasets, differences in recorded diagnosis dates, and change in agreement over time were investigated for each cancer site.
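
As a rough illustration of the comparisons described in the Methods, the sketch below computes two directional agreement proportions and a median difference in diagnosis date for one cancer site. It is not the authors' code: the function name, the dictionary layout (patient identifier mapped to diagnosis date), and the toy records are all hypothetical, and it assumes patients have already been linked across datasets and restricted to a single cancer site. Any matching rules the study applied (code lists, time windows) are not described in the abstract and are not modelled here.

```python
from datetime import date
from statistics import median


def compare_sources(cr, ehr):
    """Compare case status and diagnosis dates between two linked sources.

    cr, ehr: dicts mapping a patient identifier to a diagnosis date
    (hypothetical layout; CR is treated as the reference standard).
    """
    # Patients recorded as cases in both sources.
    both = cr.keys() & ehr.keys()

    # Date difference in days (EHR minus CR) for patients found in both.
    day_diffs = [(ehr[pid] - cr[pid]).days for pid in both]

    return {
        # Share of CR diagnoses that are also recorded in the EHR source.
        "cr_also_in_ehr": len(both) / len(cr) if cr else None,
        # Share of EHR diagnoses that are confirmed in CR.
        "ehr_confirmed_in_cr": len(both) / len(ehr) if ehr else None,
        # Median difference in recorded diagnosis date, in days.
        "median_days_ehr_minus_cr": median(day_diffs) if day_diffs else None,
    }


# Toy records, invented purely for illustration.
cr_cases = {
    "p1": date(2010, 1, 10),
    "p2": date(2011, 5, 1),
    "p3": date(2012, 3, 3),
    "p4": date(2013, 7, 7),
}
ehr_cases = {
    "p1": date(2010, 1, 10),
    "p2": date(2011, 5, 3),
    "p3": date(2012, 3, 2),
    "p5": date(2014, 1, 1),
    "p6": date(2015, 9, 9),
}

summary = compare_sources(cr_cases, ehr_cases)
print(summary)
```

Keeping the two proportions separate mirrors the distinction the Results draw between CR diagnoses that also appear in EHR data and EHR diagnoses that are confirmed in CR; the two can differ substantially for the same cancer site.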

Results: From 84 % (ovary) to 92 % (colon) of diagnoses in the cancer registry were also recorded in combined CPRD-HES data. Agreement with the cancer registry was slightly lower in HES alone (78 % (ovary) to 86 % (colon)) and in CPRD alone (61 % (ovary, pancreas) to 72 % (brain)). The proportion of CPRD-HES diagnoses confirmed in CR varied by cancer site (50 % (brain) to 86 % (lung)). Agreement between CR and HES was relatively stable within cancer sites over time. Concordance between CR and CPRD was more heterogeneous between cancer sites and over time. The best agreement in diagnosis date was observed between CR and HES (median difference 0 or 1 days, all cancer sites).

Conclusion: Agreement between CR and EHR data is heterogeneous across cancer sites. Concordance does not appear to have improved over time. Combined data from primary and secondary care may be sufficient to approximate case status in CR in some circumstances, but the choice of date taken to represent the diagnosis may affect study outcomes.

Keywords: Medical records; Registry.