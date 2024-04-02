



In a new study, scientists at Beth Israel Deaconess Medical Center (BIDMC) compared the clinical reasoning abilities of a large language model (LLM) with those of human physicians. The researchers used the Revised-IDEA (r-IDEA) score, a commonly used tool for assessing clinical reasoning.

In the study, a GPT-4-powered chatbot, 21 attending physicians, and 18 resident physicians were each given 20 clinical cases to work through and asked to lay out their diagnostic reasoning. All three sets of responses were then evaluated using the r-IDEA score. The researchers found that the chatbot achieved the highest r-IDEA scores and proved to be very good at diagnostic reasoning. However, the authors also point out that the chatbot was often simply wrong.

Stephanie Cabral, MD, lead author of the study, said further research is needed to determine how LLMs can best be incorporated into clinical practice, but that for now they could serve as a checkpoint, helping to make sure that physicians do not miss something. In summary, the results showed sound reasoning by the chatbot, but also significant mistakes. This further strengthens the idea that these AI-powered systems are best suited (at least at their current level of maturity) as tools to enhance a physician's practice rather than replace their diagnostic capabilities.

Photo: OpenAI CEO Sam Altman speaks at the OpenAI DevDay event, November 6, 2023, in San Francisco, California. Altman gave the keynote address at the first-ever OpenAI DevDay conference. (Photo by Justin Sullivan/Getty Images)


As physician leaders and engineers often explain, this is because the practice of medicine is not based purely on the algorithmic output of rules, but rather on deep reasoning and clinical intuition, which is difficult for LLMs to replicate. Nevertheless, tools that can provide diagnostic or clinical support can still be a very powerful asset in a physician's workflow. For example, if a system can reasonably provide initial diagnostic suggestions based on available data such as the patient's medical history and existing records, physicians could save significant time in the diagnostic process. Additionally, these tools can enhance physicians' workflows and increase efficiency if they improve how physicians process large amounts of clinical information from medical records.

Many organizations are pursuing these potential avenues for clinical augmentation. For example, artificial intelligence-powered scribing technology uses natural language processing to help physicians complete clinical documentation more efficiently. Enterprise search tools are being integrated within organizations and with their EMR systems, allowing physicians to search large amounts of data, promoting data interoperability, and helping them glean faster and deeper insights from existing patient data. Other systems may also be useful in providing initial diagnoses. For example, tools are emerging in radiology and dermatology that can analyze uploaded images and suggest potential diagnoses.

Nevertheless, there is still much work to be done in this area. Simply put, such AI systems are not yet ready to make clinical diagnoses on their own, but there may still be opportunities to use this technology to enhance clinical workflows. In particular, humans must be kept in the loop to ensure safe, secure, and accurate processes.

