Large Language Models and Large Multimodal Models in Medical Imaging: A Primer for Physicians

Tyler J Bradshaw; Xin Tie; Joshua Warner; Junjie Hu; Quanzheng Li; Xiang Li

doi:10.2967/jnumed.124.268072

J Nucl Med. 2025 Jan 16:jnumed.124.268072. doi: 10.2967/jnumed.124.268072. Online ahead of print.

Authors

Tyler J Bradshaw¹, Xin Tie², Joshua Warner², Junjie Hu³, Quanzheng Li⁴, Xiang Li⁴

Affiliations

¹ Department of Radiology, University of Wisconsin-Madison, Madison, Wisconsin; tbradshaw@wisc.edu.
² Department of Radiology, University of Wisconsin-Madison, Madison, Wisconsin.
³ Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin.
⁴ Center for Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts.

PMID: 39819692

Abstract

Large language models (LLMs) are poised to have a disruptive impact on health care. Numerous studies have demonstrated promising applications of LLMs in medical imaging, and this number will grow as LLMs further evolve into large multimodal models (LMMs) capable of processing both text and images. Given the substantial roles that LLMs and LMMs will have in health care, it is important for physicians to understand the underlying principles of these technologies so they can use them more effectively and responsibly and help guide their development. This article explains the key concepts behind the development and application of LLMs, including token embeddings, transformer networks, self-supervised pretraining, fine-tuning, and others. It also describes the technical process of creating LMMs and discusses use cases for both LLMs and LMMs in medical imaging.

Keywords: artificial intelligence; computer/PACS; educational; large language models; machine learning; statistics.