An artificial intelligence (AI) system trained to conduct medical interviews matched, and at times surpassed, human doctors' performance at conversing with simulated patients and listing possible diagnoses based on a patient's medical history. Does this mean ChatGPT may soon replace Dr. Silverman in the operating room?
The AI chatbot used in the study was developed by Google and is based on a large language model. The model was fine-tuned on existing real-world data, including electronic health records and transcribed medical conversations. Researchers also prompted the system to respond to patients in the tone of an empathetic clinician, one who seeks to understand a person's health history and devise potential diagnoses. In addition, the model was asked to evaluate a real-world doctor's interaction with a patient and provide feedback on how to improve that interaction.
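The researchers have not published their prompts or training code, so the persona-style prompting described above can only be illustrated in the abstract. The sketch below is hypothetical: CLINICIAN_PERSONA is an invented prompt, and generate() is a stand-in for whatever chat-model call the actual system uses.

```python
# Minimal sketch of persona-style prompting, as described in the study.
# Nothing here reflects Google's actual system; generate() is a placeholder
# for a real large-language-model chat call.

CLINICIAN_PERSONA = (
    "You are an empathetic clinician conducting a text-based consultation. "
    "Ask follow-up questions to understand the patient's health history, "
    "then list the most likely diagnoses with brief reasoning."
)

def generate(system_prompt: str, conversation: list[dict]) -> str:
    """Hypothetical stand-in for a chat-model call.

    A real implementation would send system_prompt plus the conversation
    so far to a fine-tuned model and return its reply.
    """
    return "(model reply would appear here)"

# One turn of a simulated text consultation.
conversation = [{"role": "patient", "content": "I've had chest pain for two days."}]
reply = generate(CLINICIAN_PERSONA, conversation)
conversation.append({"role": "clinician", "content": reply})
print(reply)
```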
To put the chatbot to the test, researchers enlisted 20 actors trained to impersonate patients. The actors were instructed to hold an online, text-based consultation with both the AI system and one of 20 board-certified physicians, without being told which one they were chatting with.
The actors simulated 149 clinical scenarios and were then asked to evaluate their experience. A pool of specialists also rated the performance of the AI system and the physicians.
Study Results
After analyzing the results and feedback, researchers found that the AI system matched or surpassed the physicians' diagnostic accuracy in all six medical specialties considered. Moreover, the chatbot outperformed physicians on 24 of 26 criteria for conversation quality, including:
- Politeness
- Condition explanation
- Condition treatment
- Appearance of honesty
- Expressing care and commitment
Researchers noted that the chatbot was also more accurate than board-certified primary-care physicians in diagnosing respiratory and cardiovascular conditions. It acquired a similar amount of information as the physicians did during text-based conversations, yet ranked higher on empathy.
Of course, the study had some limitations that inherently favored the chatbot. Most doctors have little experience holding text-based chat consultations with their patients, and, unlike the chat system, they cannot instantly compose eloquent, well-structured written answers to a prompt.
Although the chatbot is far from ready for clinical use, researchers are hopeful that it could one day assist real-world doctors. They do not expect an AI system to replace interactions with a physician entirely, because medicine involves more than collecting information and producing a response. The team plans to conduct more detailed studies to evaluate potential biases and to assess the ethical requirements for testing the system on real patients rather than trained actors.