Impact of Professional Background on Inter-Annotator Variability and Accuracy During Annotation of Clinical Notes

Stud Health Technol Inform. 2023 May 2:301:248-253. doi: 10.3233/SHTI230048.

Abstract

Background: The need to treat chronic diseases in the aging population is becoming increasingly urgent, and heart failure is among the most severe of these diseases. The telemedical disease management program HerzMobil was developed to improve outpatient care for these patients and reduce hospitalization rates.

Objective: This work aims to analyze the inter-annotator variability between two professional groups (healthcare and engineering) who annotate this program's free-text clinical notes with categories.

Methods: A dataset of 1,300 text snippets was annotated by 13 annotators with different backgrounds. Inter-annotator variability and accuracy were evaluated using the F1-score and analyzed for differences between categories, annotators, and their professional backgrounds.
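As a rough illustration of how such an F1-based evaluation could be computed, the sketch below scores agreement between every pair of annotators for one category and scores each annotator against a reference labeling. It is a minimal sketch under stated assumptions, not the procedure used in the study: the abstract does not specify how the F1-score was aggregated or what served as the reference, and the function names, annotator names, and category labels here are hypothetical.

```python
from itertools import combinations
from statistics import mean


def f1_for_category(labels_a, labels_b, category):
    """Binary F1 for one category, treating labels_a as the reference.

    For a single category the score is symmetric: swapping the two label
    lists only swaps false positives and false negatives.
    """
    pairs = list(zip(labels_a, labels_b))
    tp = sum(1 for a, b in pairs if a == category and b == category)
    fp = sum(1 for a, b in pairs if a != category and b == category)
    fn = sum(1 for a, b in pairs if a == category and b != category)
    denom = 2 * tp + fp + fn
    # Assumption: score 0.0 when neither list uses the category at all.
    return 2 * tp / denom if denom else 0.0


def pairwise_agreement(annotations, category):
    """Mean F1 over all annotator pairs (higher means less variability)."""
    return mean(
        f1_for_category(annotations[x], annotations[y], category)
        for x, y in combinations(sorted(annotations), 2)
    )


def accuracy_f1(annotations, reference, category):
    """F1 of each annotator against a reference labeling."""
    return {
        name: f1_for_category(reference, labels, category)
        for name, labels in annotations.items()
    }


# Hypothetical toy data: three annotators, four snippets, made-up categories.
toy_annotations = {
    "annotator_1": ["medication", "technical", "medication", "other"],
    "annotator_2": ["medication", "medication", "medication", "other"],
    "annotator_3": ["medication", "technical", "other", "other"],
}
toy_reference = ["medication", "technical", "medication", "other"]

print(pairwise_agreement(toy_annotations, "medication"))
print(accuracy_f1(toy_annotations, toy_reference, "medication"))
```

Computing the pairwise scores per category and per annotator yields the values that can then be compared between categories and between the two professional groups.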

Results: Inter-annotator variability (p<0.0001) and accuracy (p<0.0001) differed significantly between note categories. However, there was no statistically significant difference between the two annotator groups in either inter-annotator variability (p=0.15) or accuracy (p=0.84).

Conclusion: Professional background had no significant impact on the annotation of free-text HerzMobil notes.

Keywords: Austria; Electronic Health Records; Heart Failure; Natural Language Processing; Telemedicine.

MeSH terms

  • Aged
  • Austria
  • Electronic Health Records*
  • Heart Failure* / therapy
  • Hospitalization
  • Humans
  • Natural Language Processing*