AI Chatbots: From Exams to Real-World Challenges

A study highlights the limitations of AI chatbots in real-world medical interactions, despite their success in medical exam scenarios. Researchers from Harvard and Stanford tested large language models using a framework called 'CRAFT-MD' and found weaknesses in symptom gathering, patient history collection, and diagnostic accuracy.


Devdiscourse News Desk | New Delhi | Updated: 03-01-2025 17:03 IST | Created: 03-01-2025 17:03 IST

People increasingly rely on AI chatbots to make sense of symptoms or test results. However, recent research indicates that while these tools excel in medical exam-like settings, they struggle in real-world conversations.

Published in Nature Medicine, the study suggests that large language models (LLMs), which power chatbots like ChatGPT, should be rigorously evaluated before clinical use. Researchers at Harvard Medical School and Stanford University developed a framework called 'CRAFT-MD' and used it to assess how LLMs such as GPT-4 perform in realistic patient interactions.

The study reveals limitations in the chatbots' ability to conduct coherent clinical conversations, gather complete patient histories, and diagnose conditions accurately. Senior author Pranav Rajpurkar of Harvard Medical School noted that despite their prowess in exams, these models struggle significantly with dynamic, real-world medical dialogues.

(With inputs from agencies.)
