Improving the Transferability of Clinical Note Section Classification Models with BERT and Large Language Model Ensembles

Proc Conf Assoc Comput Linguist Meet. 2023 Jul:2023:125-130.

Abstract

Text in electronic health records is organized into sections, and classifying those sections into categories is useful for downstream tasks. In this work, we attempt to improve the transferability of section classification models by combining the dataset-specific knowledge in supervised learning models with the world knowledge inside large language models (LLMs). Surprisingly, we find that zero-shot LLMs outperform supervised BERT-based models when both are applied to out-of-domain data. We also find that the strengths of the two approaches are complementary, so that a simple ensemble technique leads to additional performance gains.