The potential of chatbots in chronic venous disease patient management

JVS Vasc Insights. 2023:1:100019. doi: 10.1016/j.jvsvi.2023.100019. Epub 2023 Jun 19.

Abstract

Objective: Health care providers and recipients have long used artificial intelligence and its subfields, such as natural language processing and machine learning, in the form of search engines to obtain medical information. Whereas a search engine returns a ranked list of webpages in response to a query and leaves the user to extract information from those links, ChatGPT has elevated the interface between humans and artificial intelligence by attempting to provide relevant information in a human-like textual conversation. This technology is being adopted rapidly and has enormous potential to impact various aspects of health care, including patient education, research, scientific writing, pre-visit/post-visit queries, documentation assistance, and more. The objective of this study is to assess whether chatbots could assist with answering patient questions and electronic health record inbox management.

Methods: We devised two questionnaires: (1) administrative and non-complex medical questions (based on actual inbox questions); and (2) complex medical questions on the topic of chronic venous disease. We graded the performance of publicly available chatbots regarding their potential to assist with electronic health record inbox management. Responses were graded independently by an internist and a vascular medicine specialist.

Results: On administrative and non-complex medical questions, ChatGPT 4.0 performed better than ChatGPT 3.5. ChatGPT 4.0 received a grade of 1 on all 20 questions (100%). ChatGPT 3.5 received a grade of 1 on 14 of 20 questions (70%), grade 2 on 4 of 20 questions (20%), grade 3 on 0 of 20 questions (0%), and grade 4 on 2 of 20 questions (10%). On complex medical questions, ChatGPT 4.0 performed best. ChatGPT 4.0 received a grade of 1 on 15 of 20 questions (75%), grade 2 on 2 of 20 questions (10%), grade 3 on 2 of 20 questions (10%), and grade 4 on 1 of 20 questions (5%). ChatGPT 3.5 received a grade of 1 on 9 of 20 questions (45%), grade 2 on 4 of 20 questions (20%), grade 3 on 4 of 20 questions (20%), and grade 4 on 3 of 20 questions (15%). Clinical Camel received a grade of 1 on 0 of 20 questions (0%), grade 2 on 5 of 20 questions (25%), grade 3 on 5 of 20 questions (25%), and grade 4 on 10 of 20 questions (50%).
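
As an illustrative check of the arithmetic in the Results, the following minimal sketch recomputes the reported percentages from the grade counts. The counts are taken from the text above; the dictionary layout and the function name grade_percentages are assumptions made for this example and are not part of the study's methods. The zero counts for ChatGPT 4.0 on grades 2 to 4 of the administrative questionnaire are implied by "all 20 questions" rather than stated.

```python
# Grade counts per chatbot, ordered grade 1 through grade 4, as reported in
# the Results. The data layout here is hypothetical, chosen for illustration.
ADMIN_COUNTS = {
    "ChatGPT 4.0": [20, 0, 0, 0],  # grades 2-4 implied by "all 20 questions"
    "ChatGPT 3.5": [14, 4, 0, 2],
}
COMPLEX_COUNTS = {
    "ChatGPT 4.0": [15, 2, 2, 1],
    "ChatGPT 3.5": [9, 4, 4, 3],
    "Clinical Camel": [0, 5, 5, 10],
}

QUESTIONS_PER_SET = 20  # each questionnaire had 20 questions


def grade_percentages(counts):
    """Return the percentage of questions at each grade, rounded to integers."""
    total = sum(counts)
    if total != QUESTIONS_PER_SET:
        raise ValueError(f"expected {QUESTIONS_PER_SET} graded questions, got {total}")
    return [round(100 * c / total) for c in counts]


if __name__ == "__main__":
    for label, table in (("Administrative", ADMIN_COUNTS), ("Complex", COMPLEX_COUNTS)):
        for bot, counts in table.items():
            print(label, bot, grade_percentages(counts))
```

The total check guards against a denominator that does not match the 20-question questionnaire, which is the kind of inconsistency that is easy to introduce when transcribing per-grade counts.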

Conclusions: Based on our interactions with ChatGPT regarding the topic of chronic venous disease, it is plausible that this technology may in the future be used to assist with electronic health record inbox management and reduce the workload of medical staff. However, to receive regulatory approval for that purpose, it will need extensive supervised training by subject experts, guardrails to prevent "hallucinations" and maintain confidentiality, and proof that it can perform at a level comparable to (if not better than) humans. (JVS-Vascular Insights 2023;1:100019.)

Keywords: Artificial intelligence; ChatGPT 3.5; ChatGPT 4.0; Chronic venous disease; Clinical Camel; Electronic health record inbox management; Generative AI.