Development of a predictive model for retention in HIV care using natural language processing of clinical notes

J Am Med Inform Assoc. 2021 Jan 15;28(1):104-112. doi: 10.1093/jamia/ocaa220.

Abstract

Objective: Adherence to a treatment plan by HIV-positive patients is necessary to decrease their mortality and improve their quality of life; however, some patients display poor appointment adherence and become lost to follow-up (LTFU). We applied natural language processing (NLP) to analyze indications towards or against LTFU in HIV-positive patients' notes.

Materials and methods: Unstructured lemmatized notes were labeled with an LTFU or Retained status using a 183-day threshold. An NLP and supervised machine learning system with a linear model and elastic net regularization was trained to predict this status. The prevalence of characteristic domains in the learned model weights was evaluated.
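
The 183-day labeling step can be illustrated with a minimal sketch. The abstract does not state exactly how the threshold is applied, so the rule below is an assumption: a note counts as LTFU when the gap to the patient's next visit exceeds 183 days, or when no later visit exists. The function name `label_note` and its parameters are hypothetical and are not taken from the paper.

```python
from datetime import date, timedelta
from typing import Optional

LTFU_THRESHOLD_DAYS = 183  # threshold stated in the abstract


def label_note(note_date: date, next_visit_date: Optional[date]) -> str:
    """Label a note as 'LTFU' or 'Retained'.

    Assumed rule (not confirmed by the source): a note is LTFU when the
    next visit is more than 183 days after the note, or when there is no
    later visit at all. A gap of exactly 183 days counts as Retained.
    """
    if next_visit_date is None:
        # Assumption: no recorded follow-up visit is treated as LTFU.
        return "LTFU"
    gap_days = (next_visit_date - note_date).days
    return "LTFU" if gap_days > LTFU_THRESHOLD_DAYS else "Retained"


# Illustrative dates only; these are not data from the study.
note = date(2019, 1, 10)
print(label_note(note, note + timedelta(days=90)))   # short gap
print(label_note(note, note + timedelta(days=200)))  # long gap
print(label_note(note, None))                        # no later visit
```

Whether the boundary is inclusive, and how patients with no later visit are handled, are design choices that would need to follow the study's actual definition.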

Results: We analyzed 838 LTFU vs 2964 Retained notes and obtained a mean weighted F1 of 0.912 via nested cross-validation; another experiment with notes from the same patients in both classes showed substantially lower metrics. "Comorbidities" were associated with LTFU through, for instance, "HCV" (hepatitis C virus), and likewise "Good adherence" was associated with Retained, represented by "Well on ART" (antiretroviral therapy).
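
For readers unfamiliar with the metric, a weighted F1 averages the per-class F1 scores, weighting each class by its share of the true labels. With imbalanced classes such as 838 LTFU vs 2964 Retained notes, the larger class therefore contributes more to the score. The sketch below is a plain-Python illustration on toy labels, not the authors' evaluation code; the function name `weighted_f1` is hypothetical, and the nested cross-validation itself is not shown.

```python
from collections import Counter
from typing import Sequence


def weighted_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Weight each class F1 by its share of the true labels.
        score += (count / total) * f1
    return score


# Toy labels for illustration only; not data from the study.
y_true = ["LTFU", "LTFU", "Retained", "Retained", "Retained", "Retained"]
y_pred = ["LTFU", "Retained", "Retained", "Retained", "Retained", "LTFU"]
print(round(weighted_f1(y_true, y_pred), 3))  # 0.667
```

In the toy example the LTFU class has F1 = 0.5 and the Retained class has F1 = 0.75, so the weighted score is (2/6)(0.5) + (4/6)(0.75) = 2/3.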

Discussion: Mentions of mental health disorders and substance use were associated with disparate retention outcomes; however, history vs active use was not investigated. There remains a further need to model transitions between LTFU and being retained in care over time.

Conclusion: We provided an important step for the future development of a model that could eventually help to identify patients who are at risk for falling out of care and to analyze which characteristics could be factors for this. Further research is needed to enhance this method with structured electronic medical record fields.

Keywords: HIV; lost to follow-up; machine learning; natural language processing; retention in care.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adult
  • Electronic Health Records*
  • Female
  • HIV Infections / therapy*
  • Humans
  • Lost to Follow-Up
  • Male
  • Models, Theoretical
  • Natural Language Processing*
  • Patient Compliance*
  • Retention in Care*