Purpose: This study aims to evaluate the similarity, readability, and alignment with current scientific knowledge of responses from AI-based chatbots to common questions about epilepsy and physical exercise.
Methods: Four AI chatbots (ChatGPT-3.5, ChatGPT 4, Google Gemini, and Microsoft Copilot) were evaluated. Fourteen questions on epilepsy and physical exercise were designed to compare the platforms. Lexical similarity, response patterns, and thematic content were analyzed. Readability was measured using the Flesch Reading Ease and Flesch-Kincaid Grade Level scores. Seven experts rated the quality of responses on a Likert scale from "very poor" to "very good."
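For readers unfamiliar with the two readability indices named above, the following is a minimal sketch of how they are conventionally computed from sentence, word, and syllable counts. It is not the tool used in the study, which the abstract does not specify. The function names `count_syllables` and `readability` are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed here for illustration, so scores will differ somewhat from those of dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables with a crude heuristic (an assumption, not a dictionary lookup)."""
    word = word.lower()
    # Each run of consecutive vowels is treated as one syllable.
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a trailing silent 'e' (but keep endings such as "-le").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Standard published coefficients for both indices.
    fre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fre, fkgl


if __name__ == "__main__":
    sample = "Regular physical exercise may reduce seizure frequency."
    fre, fkgl = readability(sample)
    print(f"Flesch Reading Ease: {fre:.2f}")
    print(f"Flesch-Kincaid Grade Level: {fkgl:.2f}")
```

Higher Flesch Reading Ease values indicate easier text, whereas higher Flesch-Kincaid Grade Level values indicate that more years of schooling are needed to follow it.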
Results: The responses showed lexical similarity, with approaches to physical exercise ranging from conservative to holistic. Microsoft Copilot scored the highest on the Flesch Reading Ease scale (48.42 ± 13.71), while ChatGPT-3.5 scored the lowest (23.84 ± 8.19). All responses were generally rated as difficult to read. Quality ratings ranged from "Good" to "Acceptable," with ChatGPT 4 being the preferred platform, chosen by 48.98 % of reviewers.
Conclusion: The findings highlight the potential of AI chatbots as useful sources of information on epilepsy and physical exercise. However, simplifying language and tailoring content to users' needs are essential to enhance their effectiveness.
Keywords: AI Chatbots; Artificial Intelligence; Epilepsy; Health Information; Physical Exercise.
Copyright © 2024 The Author(s). Published by Elsevier Inc. All rights reserved.