Health care, including hospital medicine, isn’t exactly known as the professional land of early adopters—and for good reason.
The regulations that govern the care of hospitalized patients can’t allow just any new application, website, or technology to get its hands on patient information. And hospitalists can’t just rely on Google searches to come up with the right approach to a particularly difficult diagnosis.
But can new technology help? Can hospital medicine embrace the latest tools in real time?
Well, ChatGPT is a test case. The large language model-based chatbot has taken the world by storm over the past year, and it has hospitalists asking each other, “What are the real use cases in clinical care, their pros and cons, and rules of the still-being-built road?”
“Like all new technology, there is a lot of hype that goes with it,” said Vineet Arora, MD, MAPP, MHM, (@FutureDocs), dean for medical education at the Pritzker School of Medicine at the University of Chicago. “I think of other technologies that have come, that people have hyped up in health care, and I think this one is one we’re looking at the version 1.0 for a large language model.”
Not yet ready for primetime
For the technologically uninitiated, ChatGPT stands for Chat Generative Pre-trained Transformer. It is an artificial intelligence (AI) technology that uses available content online to generate answers in response to questions posed by users. It mimics interpersonal dialogue as closely as it can, to make back-and-forth communication more akin to conversation. Multiple companies have similar technology, so there are different chatbots available for use online.
Generative AI technology, like the iPhone when it was first released in 2007, remains in its earliest phases. But, again like the iPhone, it has already begun pushing its way into everyday use faster than anyone but its creators could foresee.
Just not quite in health care yet.
“It’s not ready for health care primetime,” Dr. Arora said. “In contrast, Google has released Med-PaLM as an alternative specific for health care. It has a more limited release so they can make it ready for health care primetime but looks promising given the early results.”
“I think everyone is excited about (ChatGPT), and it’s slowly starting to sneak into clinical work in a couple of different ways,” said hospitalist Subha Airan-Javia, MD, FAMIA, (@subhaairan), who practices at Hospital of the University of Pennsylvania in Philadelphia. But “I think it’s far from widespread right now.”
Dr. Airan-Javia, a former associate chief medical informatics officer, says the most immediate use for ChatGPT in its current iteration is more as a resource than an end tool. In clinical decision making, she sees the opportunity potentially to streamline thought processes—to use it as a reference or resource to help find key information about a diagnosis, symptom, or contraindication. And it doesn’t require integration into your clinical systems—another plus.
“Additionally, when using it as a reference, you don’t need to enter any patient health information, which of course keeps things simpler from a security and compliance perspective,” she said. “For example, one can say, ‘I have a patient with this background and these symptoms. What diagnoses should I be thinking of?’ I have used it multiple times a day while I’m on service to help think through certain scenarios or diagnoses that I might not be as familiar with, and once it gives me that first pass, I then go and dig deeper, cross-checking with other sources such as UpToDate. Dr. Peter Lee from Microsoft says, ‘Trust, but verify.’ That is a good motto to go by when first starting with ChatGPT.”
Dr. Airan-Javia says a recent example for her was a patient presenting with bacteremia.
“The patient had a fairly atypical organism growing in their blood culture. I wasn’t sure where or what the source was, and it was an organism I wasn’t as familiar with,” she said. “So to get me started, I asked ChatGPT, ‘Can this particular organism be seen as a contaminant, or does it tend to come from a particular source in the body? And what should I be thinking about?’”
ChatGPT gave an answer, and Dr. Airan-Javia fed it a few more details and got an even more tailored response. She didn’t view the AI-generated answers as the endpoint of research, but more like a waypoint to see which path to head down next.
“That was really helpful,” she said. “I then went and researched a few more details in more typical sources and knew what to do next. But that initial step with ChatGPT summed up a large amount of research in a few paragraphs that helped get me started.”
That last step—a doctor verifying by doing more diagnostic homework—is key, particularly in the still-early days of ChatGPT.
Beth Israel Deaconess Medical Center (BIDMC) professionals published a study in the summer of 2023 that assessed the ability of one generative AI to provide accurate medical diagnoses.1 The AI selected the correct diagnosis 39% of the time and provided the correct diagnosis in its list of potential diagnoses in 64% of challenging cases, according to a BIDMC announcement.
“While chatbots cannot replace the expertise and knowledge of a trained medical professional, generative AI is a promising potential adjunct to human cognition in diagnosis,” first author Zahir Kanjee, MD, FACP, MPH, a hospitalist at BIDMC and assistant professor of medicine at Harvard Medical School in Boston, said in the press release.
“It has the potential to help physicians make sense of complex medical data and broaden or refine our diagnostic thinking. We need more research on the optimal uses, benefits, and limits of this technology, and a lot of privacy issues need sorting out, but these are exciting findings for the future of diagnosis and patient care.”2
Garbage in, garbage out
Dr. Arora notes that, similar to what happened when WebMD debuted, one of the first interactions many hospitalists have with ChatGPT will be patients bringing it up in conversation with them.
“Absolutely, it’s like Google,” Dr. Arora said. “My search engine now has ChatGPT in enabled searches at the top. I do think it’s going to be part of health care in the sense that patients are using it, just like they’re using ‘Dr. Google.’”
But that concerns Dr. Arora as some patients may not come to the hospital armed with the right information. And in the context of physical exams, how patients describe what is happening to them—or what they think ChatGPT is telling them happened to them—can cause downstream issues.
“More importantly, it’s about misinformation,” Dr. Arora said. “It’s garbage in, garbage out. There’s a lot of garbage out there that ChatGPT can learn on. And that’s dangerous. It goes back to credible sources. So, yeah, you should be asking your patients what they are learning from. What are they finding out? Trying to be proactive.
“We don’t dismiss empowered patients looking around for things. That’s important. But it’s also important that we highlight the pitfalls and how to process the information with the right degree of skepticism, if you will.”
Dr. Arora’s skepticism is mostly based on the technology’s use in clinical care. She sees burgeoning uses for ChatGPT in academic circles, where it can act almost as a teaching assistant. Similarly, she could envision a multitude of coding and billing applications to streamline that process for practitioners.
Government regulations for privacy and health care will also limit how fast ChatGPT is adapted for clinical use.
“One thing to keep in mind is, in all technology innovations, the health care sector has usually lagged behind because of regulations and because of our unique privacy issues and our unique data and billing issues,” Dr. Arora said. “The focus on tech-based conversation—it’s the currency of our field. How we make a diagnosis is conversation.”
One concern many hospitalists have about ChatGPT is the fear of getting left behind. Like cell phone users who clung to non-touch screens, practitioners who don’t embrace generative AI technology are likely to fall behind as the years pass.
Dr. Airan-Javia cautions against worrying too much about that, though.
“The rate of change is exponential,” Dr. Airan-Javia said. “I think we’ll get there faster than we’d normally think. But, for the use cases of actually going through a chart and finding the information and trusting it…we have a lot more work and research to do, to ensure that the information we get from these models is accurate.”
In addition, Dr. Arora notes that “an appropriate amount of skepticism” is prudent for now. She points to electronic health record (EHR) systems as an example, where early mistakes caused problems that lasted longer than they needed to.
“If you have a bad process, and you automate it, you can really hurt a lot of people in health care,” she said. “What I have seen a lot of right now is the testing. Does ChatGPT, can it take tests? Can it make diagnoses? And it is kind of all over the place. That obviously gives everyone pause when it’s not consistent.”
“Trust is going to play a huge part in how people rely on, or don’t rely on, tools that use generative AI,” said Dr. Airan-Javia, chief executive officer of CareAlign, an EHR workflow application created at Penn Medicine in Philadelphia. “We are okay with humans making mistakes, but when a machine makes even a tenth of the mistakes, we don’t accept that.
“We expect to see a much higher accuracy with any GenAI model. Whether that’s appropriate or not, that’s where we are currently across the board, and what we will need before we can use it more actively to interpret patient-specific data and charts.”
Richard Quinn is a freelance writer in New Jersey.
1. Kanjee Z, Crowe B, Rodman A. Accuracy of a generative artificial intelligence model in a complex diagnostic challenge. JAMA. 2023;330(1):78-80.
2. Mitchell J. Researchers test AI powered chatbots medical diagnostic ability. Beth Israel Lahey Health website. https://www.bidmc.org/about-bidmc/news/2023/06/researchers-test-ai-powered-chatbots-medical-diagnostic-ability. Published June 15, 2023. Accessed November 1, 2023.