Introduction: OpenAI's GPT-4 (artificial intelligence [AI]) is being studied as a medical decision support tool. This study examines its accuracy in refining referrals for fetal echocardiography (FE), with the aim of improving early detection of congenital heart defects (CHDs) and related outcomes.
Methods: Data from past FE referrals to our institution were evaluated independently by a pediatric cardiologist, a gynecologist (human experts [experts]), and AI, according to established guidelines. We compared agreement between the experts and AI on referral necessity, with the experts reviewing discrepancies.
Results: A total of 59 FE cases were reviewed retrospectively. The cardiologist, gynecologist, and AI recommended performing FE in 47.5%, 49.2%, and 59.0% of cases, respectively. Comparison of AI recommendations with those of the experts showed agreement of approximately 80.0% with each expert (p < 0.001). Notably, AI recommended FE more often than the experts for minor CHD (64.7% vs. 47.1%); for major CHD, the experts recommended FE in all cases (100%), whereas AI recommended it in the majority of cases (90.9%). Discrepancies between AI and the experts are detailed and reviewed.
Conclusions: The evaluation found moderate agreement between AI and the experts. Contextual misunderstandings and a lack of specialized medical knowledge limit AI, so its use requires guidance from clinical guidelines. Despite these shortcomings, AI recommended referral in 65% of minor CHD cases versus 47% for the experts, suggesting its potential as a cautious decision aid for clinicians.
Keywords: Artificial intelligence; Fetal cardiology; Fetal echocardiography.
© 2024 S. Karger AG, Basel.