A Survey of Clinicians' Views of the Utility of Large Language Models

Appl Clin Inform. 2024 Mar;15(2):306-312. doi: 10.1055/a-2281-7092. Epub 2024 Mar 5.

Abstract

Objectives: Large language models (LLMs) like Generative pre-trained transformer (ChatGPT) are powerful algorithms that have been shown to produce human-like text from input data. Several potential clinical applications of this technology have been proposed and evaluated by biomedical informatics experts. However, few have surveyed health care providers for their opinions about whether the technology is fit for use.

Methods: We distributed a validated mixed-methods survey to gauge practicing clinicians' comfort with LLMs for a breadth of tasks in clinical practice, research, and education, which were selected from the literature.

Results: A total of 30 clinicians fully completed the survey. Of the 23 tasks, 16 were rated positively by more than 50% of the respondents. Based on our qualitative analysis, health care providers considered LLMs to have excellent synthesis skills and efficiency. However, our respondents had concerns that LLMs could generate false information and propagate training data bias.Our survey respondents were most comfortable with scenarios that allow LLMs to function in an assistive role, like a physician extender or trainee.

Conclusion: In a mixed-methods survey of clinicians about LLM use, health care providers were encouraging of having LLMs in health care for many tasks, and especially in assistive roles. There is a need for continued human-centered development of both LLMs and artificial intelligence in general.

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Health Facilities
  • Health Personnel
  • Humans
  • Language