Addressing bias in preterm birth research: The role of advanced imputation techniques for missing race and ethnicity in perinatal health data

Ann Epidemiol. 2024 Jun:94:120-126. doi: 10.1016/j.annepidem.2024.05.003. Epub 2024 May 10.

Abstract

Objectives: To evaluate the effectiveness of Bayesian Improved Surname Geocoding (BISG) and Bayesian Improved First Name Surname Geocoding (BIFSG) in estimating race and ethnicity, and to assess how these estimates influence odds ratios for preterm birth.
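As a rough illustration of the idea behind BISG, the sketch below applies Bayes' rule in the form usually given for the method: the posterior probability of each category is proportional to P(category | surname) multiplied by P(geography | category). The function name `bisg_posterior`, the category labels, and all probability values are hypothetical and are not taken from the study or from census tables. BIFSG would multiply in one further term for the first name.

```python
def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Combine surname and geography evidence with Bayes' rule.

    posterior(r) is proportional to P(r | surname) * P(geo | r),
    then normalized so the probabilities sum to 1.
    """
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no category has non-zero probability")
    return {race: value / total for race, value in unnormalized.items()}


# Hypothetical inputs for one person (illustrative values only).
p_race_given_surname = {"nh_white": 0.60, "nh_black": 0.30, "hispanic": 0.10}
p_geo_given_race = {"nh_white": 0.001, "nh_black": 0.004, "hispanic": 0.002}

posterior = bisg_posterior(p_race_given_surname, p_geo_given_race)
```

With these made-up inputs, the geography term shifts most of the probability mass from the category favored by the surname alone to the one that is more common in the person's area.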

Methods: We analyzed electronic health record (EHR) data from hospital birth admissions (N = 9985). We created two simulation sets in which 40% of race and ethnicity values were missing, either at random or with a higher probability of missingness for non-Hispanic Black birthing people who had a preterm birth. We calculated C-statistics to evaluate how accurately BISG and BIFSG estimated race and ethnicity. We examined the association between race and ethnicity and preterm birth using logistic regression and reported odds ratios (OR).
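The two missingness scenarios described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function `simulate_missing`, the record fields, the toy data, and the weight of 3.0 for the not-at-random scenario are all hypothetical, since the abstract does not say how much more likely missingness was in the affected group.

```python
import random


def simulate_missing(records, mechanism, rate=0.40, weight=3.0, seed=0):
    """Return copies of records with `race` set to None for a share of rows.

    mechanism="mcar": every row is equally likely to lose its value.
    mechanism="mnar": rows that are nh_black AND preterm are `weight` times
    as likely to be picked (weight is an assumed, illustrative value).
    """
    rng = random.Random(seed)
    n_missing = round(rate * len(records))
    if mechanism == "mcar":
        missing_idx = set(rng.sample(range(len(records)), n_missing))
    elif mechanism == "mnar":
        # Weighted sampling without replacement: draw key u ** (1 / w)
        # per row and keep the rows with the largest keys.
        keys = []
        for i, rec in enumerate(records):
            targeted = rec["race"] == "nh_black" and rec["preterm"]
            w = weight if targeted else 1.0
            keys.append((rng.random() ** (1.0 / w), i))
        keys.sort(reverse=True)
        missing_idx = {i for _, i in keys[:n_missing]}
    else:
        raise ValueError("mechanism must be 'mcar' or 'mnar'")
    return [
        dict(rec, race=None) if i in missing_idx else dict(rec)
        for i, rec in enumerate(records)
    ]


# Toy data, not study data: 1000 records with arbitrary patterns.
records = [
    {"race": "nh_black" if i % 5 == 0 else "other",
     "preterm": i % 7 == 0 or i % 10 == 0}
    for i in range(1000)
]
mcar = simulate_missing(records, "mcar")
mnar = simulate_missing(records, "mnar")
```

Both scenarios remove the same overall share of values; they differ only in which rows lose them, which is what lets the bias of each analysis method be compared against the known truth.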

Results: BISG and BIFSG showed high accuracy for most racial and ethnic categories (C-statistics = 0.94-0.97, 95% confidence intervals [CI] = 0.92-0.97). When race and ethnicity were not missing at random, BISG (OR = 1.25, CI = 0.97-1.62) and BIFSG (OR = 1.38, CI = 1.08-1.76) produced odds ratios in the same direction as the true association (OR = 1.68, CI = 1.34-2.09) for non-Hispanic Black birthing people, whereas traditional methods produced estimates in the opposite direction (complete-case analysis OR = 0.62, CI = 0.41-0.94; multiple imputation OR = 0.63, CI = 0.40-0.98).
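For readers unfamiliar with the C-statistic, the sketch below computes it for one category as the share of (member, non-member) pairs in which the member receives the higher predicted probability, with ties counted as one half. The function name `c_statistic` and the example values are hypothetical; the abstract does not describe the software the authors used.

```python
def c_statistic(labels, scores):
    """Concordance statistic (area under the ROC curve) for a binary label.

    labels: 1 if the person truly belongs to the category, else 0.
    scores: predicted probability of belonging to the category.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one member and one non-member")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(pos) * len(neg))


# Illustrative values only: 2 members and 3 non-members of a category.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
c = c_statistic(labels, scores)  # 5 of 6 pairs are concordant
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the reported 0.94-0.97 should be read.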

Conclusions: BISG and BIFSG accurately estimate missing race and ethnicity in perinatal EHR data and reduce bias in preterm birth research; they are recommended over traditional methods for handling missing race and ethnicity.

Keywords: Electronic health records; Health disparities; Missing data; Preterm birth; Race and ethnicity.

MeSH terms

  • Adult
  • Bayes Theorem*
  • Bias*
  • Black or African American
  • Electronic Health Records*
  • Ethnicity* / statistics & numerical data
  • Female
  • Humans
  • Infant, Newborn
  • Perinatal Care / statistics & numerical data
  • Pregnancy
  • Premature Birth* / ethnology
  • Racial Groups / statistics & numerical data