A trio of pediatricians at Cohen Children’s Medical Center, in New York, has found ChatGPT’s pediatric diagnostic skills to be considerably lacking after asking the LLM to diagnose 100 random case studies. In their study, reported in the journal JAMA Pediatrics, Joseph Barile, Alex Margolis and Grace Cason tested ChatGPT’s diagnostic skills.
Pediatric diagnostics is particularly challenging, the researchers note, because in addition to taking into account all the symptoms found in a particular patient, age must be considered as well. In this new effort, they noted that LLMs have been promoted by some in the medical community as a promising new diagnostic tool. To determine their efficacy, the researchers assembled 100 random pediatric case studies and asked ChatGPT to diagnose them.
To keep things simple, the researchers used a single approach in querying the LLM for all the case studies. They first pasted in the text from the case study, and then followed up with the prompt “List a differential diagnosis and a final diagnosis.”
A differential diagnosis is a methodology used to suggest a preliminary diagnosis (or several candidates) based on a patient's history and physical exams. The final diagnosis, as its name suggests, is the believed cause of the symptoms. Answers given by the LLM were scored by two physician colleagues who were not otherwise involved in the study. There were three possible scores: "correct," "incorrect" and "did not fully capture diagnosis."
The research team found that ChatGPT produced a correct diagnosis in just 17 of the 100 cases. Of the 83 it got wrong, 11 were clinically related to the correct diagnosis but were too broad to be counted as correct.
The researchers note the obvious: ChatGPT is clearly not yet ready to be used as a diagnostic tool, though they suggest that more selective training could improve results. In the meantime, they add, LLMs like ChatGPT may still prove useful in administrative roles, such as assisting in writing research articles or generating aftercare instruction sheets for patients.
More information:
Joseph Barile et al, Diagnostic Accuracy of a Large Language Model in Pediatric Case Studies, JAMA Pediatrics (2024). DOI: 10.1001/jamapediatrics.2023.5750
© 2024 Science X Network
Citation:
ChatGPT found to have very low success rate in diagnosing pediatric case studies (2024, January 4)
retrieved 4 January 2024
from https://medicalxpress.com/news/2024-01-chatgpt-success-pediatric-case.html
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.