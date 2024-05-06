



Every day, physicians treat large numbers of patients with needs ranging from the simple to the highly complex. Providing effective care requires knowing each patient's health records and staying up to date on the latest procedures and treatments. And then there is the all-important doctor-patient relationship, which is built on empathy, trust, and communication. AI needs to be able to do all of these things if it is to come close to emulating real-world doctors.

The intersection of AI and healthcare is in full swing. In the past six months alone, New Atlas has reported on AI models that help junior doctors identify precursors to colon cancer, diagnose childhood autism from images of the eye, and predict in real time whether a surgeon has removed all the cancerous tissue during breast cancer surgery. But Med-Gemini is something else.

Google's Gemini models are a new generation of multimodal AI models. This means that they can process information from different modalities, such as text, images, video, and audio. The models are proficient in language and conversation, in understanding the diverse information they are trained on, and in so-called long-context reasoning, that is, reasoning over large amounts of data, such as hours of video or dozens of hours of audio.

Med-Gemini has all the benefits of the foundational Gemini models, but fine-tuned for medicine. The researchers tested these medically focused adjustments and described their results in a paper. There's a lot going on in the 58-page paper, so we have selected the most impressive parts.

Self-training and web search capabilities

To arrive at a diagnosis and develop a treatment plan, a physician needs to combine his or her medical knowledge with a lot of other relevant information: the patient's symptoms, medical, surgical, and social history, results of laboratory and other investigational tests, and the patient's response to prior treatment. Treatments are a moveable feast: existing treatments are updated and new ones are introduced. All of these influence physicians' clinical reasoning.

That's why Google has built access to web-based search into Med-Gemini, enabling more sophisticated clinical reasoning. Like many medicine-focused large language models (LLMs), Med-Gemini was trained on MedQA, a set of multiple-choice questions representative of the United States Medical Licensing Examination (USMLE), which is designed to test medical knowledge and reasoning across a variety of scenarios.

How Med-Gemini's self-training and web search tools work

Saab et al.

However, Google has also developed two new datasets for the model. The first, MedQA-R (Reasoning), extends MedQA with synthetically generated reasoning explanations called Chain-of-Thoughts (CoTs). The second, MedQA-RS (Reasoning and Search), provides the model with instructions to use web search results as additional context to improve answer accuracy. When a medical question yields an uncertain answer, the model is prompted to perform a web search to obtain further information to resolve the uncertainty.
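The idea of searching only when the answer is uncertain can be sketched in a few lines. This is a minimal illustration, not Google's implementation: it assumes uncertainty is estimated by sampling several answers to the same multiple-choice question and measuring the Shannon entropy of the votes. The function names and the 0.8-bit threshold are hypothetical choices made for this example.

```python
import math
from collections import Counter


def answer_entropy(sampled_answers):
    """Shannon entropy (in bits) of the empirical distribution of answers."""
    counts = Counter(sampled_answers)
    total = len(sampled_answers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def should_search(sampled_answers, threshold=0.8):
    """Trigger a web search only when the sampled answers disagree enough.

    The threshold is an illustrative assumption, not a published value.
    """
    return answer_entropy(sampled_answers) > threshold


def majority_answer(sampled_answers):
    """Most frequent answer among the samples (expects a non-empty list)."""
    return Counter(sampled_answers).most_common(1)[0][0]


# Five hypothetical samples for one multiple-choice question.
confident = ["B", "B", "B", "B", "B"]
split = ["A", "B", "A", "B", "C"]

print(should_search(confident))  # all samples agree, so no search
print(should_search(split))      # samples disagree, so search for more context
```

In this sketch a unanimous vote has zero entropy and skips the search, while a split vote exceeds the threshold and would send the question on to a retrieval step before answering again.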

Med-Gemini was tested on 14 medical benchmarks, established new state-of-the-art (SoTA) performance on 10 medical benchmarks, and outperformed the GPT-4 model family on all comparable benchmarks. On the MedQA (USMLE) benchmark, Med-Gemini achieved 91.1% accuracy using an uncertainty-based search strategy, outperforming Google's previous medical LLM, Med-PaLM 2, by 4.5%.

On seven multimodal benchmarks, including the New England Journal of Medicine (NEJM) Image Challenge (images of difficult clinical cases, with a diagnosis to be chosen from a list of 10), Med-Gemini outperformed GPT-4 by an average relative margin of 44.5%.

The researchers said that while the results were promising, significant further research was needed. For example, they have not yet considered restricting search results to more authoritative medical sources, using multimodal search retrieval, or analyzing the accuracy and relevance of search results and the quality of citations. Additionally, it remains to be seen whether smaller LLMs can also be taught to use web search. The researchers leave these explorations for future work.

Retrieving specific information from long electronic medical records

Electronic health records (EHRs) can be long, but doctors need to be aware of what's in them. Complicating matters, EHRs typically contain textual similarities (diabetes and diabetic nephropathy), misspellings, acronyms (Rx and prescription), and synonyms (cerebrovascular accident and stroke), all of which can be a challenge for AI.

To test Med-Gemini's ability to understand and reason about long-context medical information, researchers performed a so-called needle-in-a-haystack task using the Medical Information Mart for Intensive Care (MIMIC-III), a large, publicly available database that contains anonymized health data of patients admitted to intensive care.

The goal of the model was to search a large collection of clinical notes in the EHR (the haystack) for relevant mentions of rare and subtle medical conditions, symptoms, or procedures (the needle).

Two hundred examples were selected, each consisting of a collection of anonymized EHR notes from 44 ICU patients with long medical histories. Each example had to meet the following criteria: it contained more than 100 medical notes, with a length ranging from 200,000 to 700,000 words; the condition was mentioned only once; and there was a single condition of interest.

The needle-in-a-haystack task had two steps. First, Med-Gemini had to retrieve all mentions of a specified medical problem from the extensive records. Second, the model had to evaluate the relevance of all mentions, classify them, conclude whether the patient had a history of the problem, and provide a clear basis for the decision.
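The two steps can be pictured as a retrieve-then-classify pipeline. The sketch below is a toy stand-in under stated assumptions, not the method used in the paper: step one uses a plain regular-expression search over the notes, and step two replaces the model's judgment with a crude negation check. All names (`Mention`, `retrieve_mentions`, `is_affirmed`, `has_history`) and the cue list are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Mention:
    note_id: int   # index of the note the mention came from
    snippet: str   # surrounding text kept as evidence


def retrieve_mentions(notes, terms, window=40):
    """Step 1: collect a snippet around every occurrence of any term."""
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    mentions = []
    for note_id, text in enumerate(notes):
        for match in pattern.finditer(text):
            start = max(0, match.start() - window)
            end = min(len(text), match.end() + window)
            mentions.append(Mention(note_id, text[start:end]))
    return mentions


# Illustrative cues only; a real system would rely on the model's reasoning.
NEGATION_CUES = ("no history of", "denies", "ruled out", "negative for")


def is_affirmed(mention):
    """Step 2 (toy stand-in): count a mention unless a negation cue is nearby."""
    snippet = mention.snippet.lower()
    return not any(cue in snippet for cue in NEGATION_CUES)


def has_history(notes, terms):
    """Return the decision together with the snippets that support it."""
    evidence = [m for m in retrieve_mentions(notes, terms) if is_affirmed(m)]
    return bool(evidence), evidence


notes = [
    "Patient denies chest pain. No history of stroke.",
    "Admitted 2019 after cerebrovascular accident; Rx aspirin.",
]
found, evidence = has_history(notes, ["stroke", "cerebrovascular accident"])
print(found, [m.note_id for m in evidence])
```

Passing synonyms as separate search terms is a nod to the acronym and synonym problem described earlier; returning the supporting snippets mirrors the requirement that the decision come with a clear basis.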

Examples of Med-Gemini's long context functionality

Saab et al.

Compared to the SoTA method, Med-Gemini performed well on the needle-in-a-haystack task. It scored lower on precision (0.77 vs. 0.85 for the SoTA method) but outperformed the SoTA method on recall (0.76 vs. 0.73).
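One common way to weigh precision against recall is the F1 score, the harmonic mean of the two. The short calculation below applies that standard formula to the rounded figures quoted above. It is a worked example only: the variable names are illustrative, and because the inputs are rounded, the results may differ slightly from any F1 values reported in the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded precision and recall as quoted in the article.
med_gemini_f1 = f1_score(0.77, 0.76)
sota_f1 = f1_score(0.85, 0.73)

print(f"Med-Gemini F1: {med_gemini_f1:.3f}")
print(f"SoTA method F1: {sota_f1:.3f}")
```

On these rounded inputs the two methods land within a few hundredths of each other, which fits a trade-off between higher recall and lower precision rather than a clear win for either side.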

Perhaps the most notable aspect of Med-Gemini is its long-context processing capabilities, which open up new performance frontiers and novel, previously infeasible applications for medical AI systems, the researchers say. The needle-in-a-haystack retrieval task reflects a real-world challenge faced by clinicians, and Med-Gemini-M 1.5's performance shows its potential to significantly reduce cognitive load and augment clinicians' capabilities by efficiently extracting and analyzing crucial information from vast amounts of patient data.

For an easy-to-understand explanation of these key research points, and the latest information on the slugfest between Google and Microsoft, watch the AI Explained video starting at 13:38.

New OpenAI models 'imminent' raises stakes in AI (plus Med Gemini, GPT 2 chatbots, Scale AI)

Conversations with Med-Gemini

In a test of real-world usability, Med-Gemini was asked by a patient user about an itchy skin lump. After requesting an image, the model asked appropriate follow-up questions, correctly diagnosed the rare lesion, and recommended what the user should do next.

Example of a Med-Gemini diagnostic dialogue in a dermatology setting

Saab et al.

Med-Gemini was also asked to interpret a chest X-ray for a physician who was awaiting the official radiologist's report, and to create a plain English version of the report that could be provided to the patient.

Example of Med-Gemini's diagnostic dialogue support in a radiology setting

Saab et al.

Researchers say Med-Gemini-M 1.5's multimodal conversation capabilities are promising given that they were achieved without any fine-tuning specifically on medical dialogue. Such capabilities enable seamless and natural interactions between people, clinicians, and AI systems.

However, researchers acknowledge that more research is needed.

While this capability has great potential for useful real-world applications, such as assisting clinicians and patients, it also comes with very serious risks, the researchers said. While highlighting the potential for future research in this area, they noted that they have not rigorously benchmarked capabilities for clinical conversation in this study, as others have previously explored in dedicated research on conversational diagnostic AI.

Future vision

Where do we go from here? The researchers acknowledge that there is still much work to be done, but the initial capabilities of the Med-Gemini model are certainly promising. Importantly, they plan to incorporate responsible AI principles such as privacy and fairness throughout the model development process.

Researchers said privacy considerations, in particular, need to be rooted in existing health care policies and regulations that govern and protect patient information. Fairness is also an area that requires attention, given the risk that AI systems in healthcare may unintentionally reflect or amplify historical biases and inequities, potentially leading to disparate model performance and harmful outcomes.

But ultimately, Med-Gemini is seen as a tool for good.

Large multimodal language models are ushering in a new era of possibilities in health and medicine, researchers said. The capabilities demonstrated by Gemini and Med-Gemini signal significant advances in the depth and breadth of opportunities to accelerate biomedical discovery and support healthcare delivery and experience. Most importantly, however, advances in model capabilities must be accompanied by careful attention to the reliability and safety of these systems. By prioritizing both aspects, we can responsibly envision a meaningful future where the capabilities of AI systems safely advance both scientific progress and healthcare.

The study can be accessed through the preprint website arXiv.

Sources: https://Google.com/, https://newatlas.com/technology/google-med-gemini-ai/

