A mother-child data linkage approach using data from the information system for the development of research in primary care (SIDIAP) in Catalonia

J Biomed Inform. 2024 Nov:159:104747. doi: 10.1016/j.jbi.2024.104747. Epub 2024 Nov 6.

Abstract

Background: Large-scale clinical databases containing routinely collected electronic health records (EHRs) data are a valuable source of information for research studies. For example, they can be used in pharmacoepidemiology studies to evaluate the effects of maternal medication exposure on neonatal and pediatric outcomes. Yet this type of study is infeasible without proper mother-child linkage.

Methods: We leveraged all eligible active records (N = 8,553,321) of the Information System for Research in Primary Care (SIDIAP) database. Mothers and infants were linked using a deterministic approach, and linkage accuracy was evaluated in terms of the number of records from candidate mothers that failed to link. We validated the identified mother-child links by comparing linked and unlinked records for both candidate mothers and descendants. Differences between linked and unlinked records were evaluated using effect sizes rather than p-values. Overall, we described our data linkage process following the GUidance for Information about Linking Data sets (GUILD) principles.
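
As an illustration of what a deterministic linkage step and its failure-to-link measure can look like, the following is a minimal sketch only. The abstract does not state which fields were used as linkage keys, so the field names (`family_key`, `delivery_date`, `birth_date`), the toy records, and the helper functions are all hypothetical assumptions, not the SIDIAP schema or the authors' procedure.

```python
from datetime import date

# Hypothetical toy records. Field names are illustrative assumptions,
# not the actual SIDIAP schema or the linkage keys used in the study.
candidate_mothers = [
    {"mother_id": "M1", "family_key": "F01", "delivery_date": date(2015, 3, 2)},
    {"mother_id": "M2", "family_key": "F02", "delivery_date": date(2016, 7, 9)},
    {"mother_id": "M3", "family_key": "F03", "delivery_date": date(2017, 1, 20)},
]
children = [
    {"child_id": "C1", "family_key": "F01", "birth_date": date(2015, 3, 2)},
    {"child_id": "C2", "family_key": "F02", "birth_date": date(2016, 7, 9)},
    {"child_id": "C3", "family_key": "F09", "birth_date": date(2017, 1, 20)},
]


def link_deterministic(mothers, children):
    """Return (mother_id, child_id) pairs whose keys agree exactly."""
    # Index children by the exact-match key so each mother is a dict lookup.
    index = {}
    for child in children:
        key = (child["family_key"], child["birth_date"])
        index.setdefault(key, []).append(child["child_id"])

    links = []
    for mother in mothers:
        key = (mother["family_key"], mother["delivery_date"])
        for child_id in index.get(key, []):
            links.append((mother["mother_id"], child_id))
    return links


def link_rate(mothers, links):
    """Share of candidate mothers with at least one linked child."""
    linked_mothers = {mother_id for mother_id, _ in links}
    return len(linked_mothers) / len(mothers)


links = link_deterministic(candidate_mothers, children)
rate = link_rate(candidate_mothers, links)
print(links)
print(f"link rate: {rate:.1%}")
```

In this toy example one candidate mother has no child with a matching key, so she counts toward the failed-to-link total. A deterministic rule accepts a pair only on exact agreement, which tends to keep false links low at the cost of missing pairs whose keys are incomplete or inconsistent.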

Results: We identified 744,763 unique mother-child relationships, linking 83.8% of candidate mothers with delivery dates within a period of 15 years. Of note, we provide a record-level category label used to derive a global confidence metric for the presented linkage process. Our validation analysis showed that linked and unlinked records were similar across a number of aggregated attributes.
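
The comparison of linked and unlinked records by effect size can be sketched as follows. The abstract does not name the effect size measure, so Cohen's d with a pooled standard deviation is an assumption here, and the ages are made-up values for illustration only. One common reason to prefer effect sizes in databases of this size is that p-values can flag even negligible differences as statistically significant.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Made-up maternal ages at delivery, not data from the study.
linked_ages = [28, 31, 33, 30, 29]
unlinked_ages = [29, 32, 34, 31, 30]

d = cohens_d(linked_ages, unlinked_ages)
print(f"Cohen's d: {d:.3f}")
```

Unlike a p-value, the magnitude of d does not shrink or grow with sample size alone, so it describes how different the two groups are rather than how detectable the difference is.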

Conclusions: Complementing the SIDIAP database with mother-child links will allow clinical researchers to expand their epidemiologic studies with the ultimate goal of improving outcomes for pregnant women and their children. Importantly, the reported information at each step of the data linkage process will contribute to the validity of analyses and interpretation of results in future studies using this resource.

Keywords: Electronic health records; Epidemiology; Health informatics; Medical information systems; Primary care; Public healthcare.

MeSH terms

  • Adult
  • Databases, Factual
  • Electronic Health Records*
  • Female
  • Humans
  • Infant
  • Infant, Newborn
  • Medical Record Linkage* / methods
  • Mother-Child Relations
  • Mothers / statistics & numerical data
  • Pregnancy
  • Primary Health Care*
  • Spain