



Google has announced (but hasn't released) Gemini 1.5, an update to its flagship language model. That model once powered a chatbot known as Bard, but a week ago the chatbot was renamed Gemini, for synergy.

The big claim of this release is “a breakthrough in long-context understanding across modalities.” It's also built on a type of architecture known as “Mixture-of-Experts” (MoE), which is meant to be a step up in efficiency as well. That would probably mean it performs similarly to Gemini 1.0 but requires less power from the GPUs it relies on to crank out its results.

That first big claim about multimodal “long-context” understanding is every bit as jargon-heavy as it sounds, but a co-founder of Google DeepMind has posted a demo on X that aims to show what it means in practice.


Making judicious use of a large pile of public domain text that is unlikely to annoy anyone copyright-minded (in this case, the 402-page transcript of the NASA mission that landed on the moon), the LLM is able to home in on what the user needs. Apparently that's what “long context” means: the model copes even when the prompt (“context”) is absolutely huge (“long”).

In the demo, Gemini 1.5 is able to pull three funny moments from the novel-length text. It can also find the event in the transcript that matches a picture of a moon boot print (you know, the part where Neil Armstrong walks on the moon), which clarifies what “multimodal” means in this context: an image recognition model working in concert with the LLM.


This upgrade is part of Google's continued effort to stay in the AI conversation after OpenAI released ChatGPT and ate everyone's AI lunch in 2022. Late last year, Google began seriously promoting changes to Bard and the model that underpins it, a large language model that is still better known for being built into popular Google and Android products than for being used like ChatGPT to solve everyday problems and astonish people at cocktail parties. In particular, a version of Gemini was promoted in December 2023 as having outperformed OpenAI's GPT-4 model in certain cases in a research paper, making it the first LLM to earn a passing score on a specific AI test known as “Massive Multitask Language Understanding,” or MMLU.

Among other claims about Gemini 1.5, Google says the new model can handle large datasets with incredible accuracy and, in a somewhat eyebrow-raising claim, that it delivers superior performance when reasoning about all types of data. Reasoning is famously the weak spot of most LLMs.

According to CEO Sundar Pichai, Google will release Gemini 1.5 to a limited group. “We're excited to offer a limited preview of this feature to our developers and enterprise customers,” Pichai said in a Google blog post.

The broader community of Gemini users will be the final arbiters of Google's performance claims, if and when they are actually allowed to try Gemini 1.5 as part of an officially released product. That may take a while, since Gemini Ultra, Google's most powerful model, was released only a week ago, but it's safe to assume that Gemini 1.5 will someday be part of Google's new premium (that is, “paid”) Workspace package, a product called Google One AI Premium.


