The past decade has seen extraordinary and rapid progress in the field of artificial intelligence (AI), which has produced computer systems capable of performing tasks that typically require human intelligence. These advances have yielded wide-ranging applications across domains, revolutionising industries and transforming the way humans live and work. Accordingly, potential applications of AI technology in the healthcare field are being investigated, such as deep learning for image analysis to detect lesions and provide diagnostic support, as well as data mining and machine learning for analysing patient data to predict disease risks.1,2
Chat Generative Pre-trained Transformer (ChatGPT; OpenAI, San Francisco, USA) is an advanced conversational AI model that generates human-like responses based on a large dataset of text and can engage in interactive, contextually relevant conversations. The model has experienced remarkable growth since its public launch in November 2022, finding widespread use across online platforms, serving as a virtual assistant on devices, and improving customer support. It has also become a valuable tool for language learners, offering conversational practice and feedback. Moreover, individuals engage with ChatGPT personally, seeking advice, brainstorming ideas, and enjoying its companionship. Given the widespread use of this technology, it is inevitable that patients will use it to gather medical information. Accessing medical information via ChatGPT may make it easier for patients to inform themselves and to ask questions that they might find difficult to ask healthcare professionals directly. It also has potential value for healthcare providers, as it can help convey complex medical information in a more understandable manner. However, there is insufficient evidence thus far to confirm whether ChatGPT can provide accurate and useful information to patients.
In this issue of the Annals, Koh et al. explored the quality of medical information provided by this natural-language AI model and assessed the opportunities and challenges of using it for patient education, focusing on the coronary angiogram procedure.3 The authors employed a conversational approach, posing common questions about coronary angiography to ChatGPT and evaluating its responses across different domains. The strengths of ChatGPT’s answers were evident: they were comprehensive, systematic and presented in plain language accessible to laypersons. Notably, the model emphasised the importance of involving healthcare professionals in discussing individual circumstances and acknowledged its limitations in providing personalised recommendations.
However, the study also identified certain limitations in ChatGPT’s responses. Factual inaccuracies were infrequent but present, including confusion between antiplatelet and anticoagulant drugs, inaccurate indications and risks of angiography, and incorrect contraindications. Significant omissions were noted, such as the exclusion of active acute coronary syndromes as an indication for angiography, as were inaccurate assumptions, such as the suggestion that sedation is routine during the procedure. Additionally, the model appeared inflexible in expanding its recommendations beyond the specific line of questioning, limiting its ability to consider non-cardiac causes of symptoms. Given that the medical information provided to patients can influence their decision-making, these issues must not be overlooked.
Previous studies have also reported that ChatGPT may provide inaccurate medical information. A recent study examined the appropriateness of AI model responses to questions on guideline-based cardiovascular prevention topics and found errors in 16% (4 of 25 questions) of ChatGPT’s responses.4 Another study, which evaluated the accuracy of ChatGPT in answering clinical questions based on the Japanese Society of Hypertension guidelines, revealed an overall accuracy of only 64.5% for ChatGPT’s answers.5
The limitations of ChatGPT can be attributed to several factors. First, the inclusion of information from diverse internet sources introduces the possibility of presenting inaccurate information to users. Second, the need for probing and prompting to elicit information that healthcare providers would typically offer during counselling highlights the model’s limitations in conveying comprehensive information; such omissions may affect patients’ decision-making. Third, the conversational nature and scoping of the topic may restrict the model’s flexibility in addressing lateral aspects, especially when considering differential diagnoses. Furthermore, the model’s reliance on training data with a cut-off date may result in outdated or incomplete information. Effort should therefore be directed towards improving the accuracy of the information ChatGPT provides, which may involve incorporating real-time medical updates and ensuring access to reliable, up-to-date sources.
In addition to the challenges revealed by the study by Koh et al, several issues must be addressed before ChatGPT can be considered a useful tool for patients’ medical information-gathering. There is concern regarding the risk of data breaches and the leakage of personal information shared by patients; implementing robust security measures and data protection protocols will be crucial to minimise these risks and safeguard patient privacy. Additionally, ChatGPT’s lack of human-like emotions and ethical judgement poses challenges when it deals with important medical decisions or provides information tailored to individual situations.
In conclusion, ChatGPT demonstrates potential as an adjunct tool for patients to acquire health information, and understanding the strengths and limitations of such natural-language AI platforms is essential for both patients and healthcare providers. Incorporating these models into healthcare systems alongside ongoing improvements holds promise for enhancing healthcare delivery. Nonetheless, it is vital to recognise that current AI models cannot replace the pivotal role of healthcare providers in delivering personalised care.
Conflict of interest
The authors have no conflicts of interest to disclose.
Correspondence: Dr Satoshi Honda, Cardiovascular Medicine, National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shimmachi, Suita, Osaka, 564-8565, Japan. Email: [email protected]
REFERENCES
1. Haug CJ, Drazen JM. Artificial intelligence and machine learning in clinical medicine. N Engl J Med 2023;388:1201-8.
2. Ambale-Venkatesh B, Yang X, Wu CO, et al. Cardiovascular event prediction by machine learning: the Multi-Ethnic Study of Atherosclerosis. Circ Res 2017;121:1092-101.
3. Koh SJQ, Yeo KK, Yap JJL, et al. Strengths and limitations of ChatGPT and natural-language artificial intelligence models for patient education on coronary angiogram. Ann Acad Med Singap 2023;52:374-7.
4. Sarraju A, Bruemmer D, Van Iterson E, et al. Appropriateness of cardiovascular disease prevention recommendations obtained from a popular online chat-based artificial intelligence model. JAMA 2023;329:842-4.
5. Kusunose K, Kashima S, Sata M. Evaluation of the accuracy of ChatGPT in answering clinical questions on the Japanese Society of Hypertension guidelines. Circ J 2023;87:1030-3.