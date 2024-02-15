



Another day, another generative AI update.

Google's AI subsidiary DeepMind has previewed Gemini 1.5 Pro, an upgraded version of Gemini, the Google chatbot formerly known as Bard. Gemini got its new name less than a week ago with the release of its premium paid version, Ultra, which Google calls “our largest, most capable, cutting-edge AI model.”

Gemini 1.5 Pro is the latest evolution of Google's chatbot, which also recently added the ability to generate images from text.

Gemini 1.5 Pro can take in video, images, audio, and text to answer questions, and while it boasts a number of advantages over its predecessor, most of us still won't be able to get our hands on it. DeepMind announced in a conference call with the press on Wednesday that it would first provide access to developers and enterprise customers.

Oriol Vinyals, vice president of research at Google DeepMind and co-leader of Gemini, called it a “research release” aimed at a “technology-savvy audience.”

When a new model unlocks new functionality, Vinyals added, it makes sense to let creative minds see what it can do, both to understand what the model does and to learn how it ultimately affects users.

DeepMind plans to “slowly roll out” the model to the average Joe and Jane through a waiting list.

The limited release of Gemini 1.5 Pro comes amid increased activity in a sector predicted to reach $1.3 trillion in revenue by 2032. Meanwhile, ChatGPT maker OpenAI has released its GPT-4 Turbo large language model and lets anyone create custom AI apps for its app store. Microsoft plans to add a special key to Windows 11 laptops and PCs to launch its AI tool Copilot.

Performance improvements, new architecture, and longer context windows

Gemini 1.5 Pro has “feature parity” with the Gemini 1.0 Ultra model that Google announced on February 8th. The 1.5 Pro model's win rate (a measure of how often it outperforms another model on benchmarks) is 87% compared to 1.0 Pro and 55% compared to 1.0 Ultra. So 1.5 Pro is essentially an upgraded version of the best model currently available.

Research progresses rapidly. “We're getting these kinds of breakthroughs and new model versions every few months,” Vinyals said. As models become more capable, he added, the aim is for each one to be “a drop-in replacement from the previous generation in the sense that it's more capable, so we can do what it was already doing, and hopefully do it better.”

According to Vinyals, 1.5 Pro is also “very efficient” thanks to its unique architecture, which lets it answer questions by focusing on expert sources on a particular subject rather than searching for answers from all possible sources.

Finally, 1.5 Pro has a longer context window and can take in up to 1 million tokens. This equates to 1 hour of video, 11 hours of audio, 30,000 lines of code, or 700,000 words.
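
To get a feel for those numbers, here is a rough back-of-the-envelope sketch in Python. It uses only the equivalents quoted above and assumes, purely for illustration, that they scale linearly and can be added together; the function names are hypothetical and this is not how Gemini actually counts tokens.

```python
# Amount of each medium that fills the whole 1-million-token window,
# taken from the equivalents quoted in the article.
CAPACITY = {
    "video_hours": 1,
    "audio_hours": 11,
    "code_lines": 30_000,
    "words": 700_000,
}


def window_fraction(**amounts):
    """Estimate the share of the context window used (hypothetical helper).

    Assumes usage scales linearly per medium and that media add up,
    which is a simplification, not a documented property of the model.
    """
    return sum(amount / CAPACITY[unit] for unit, amount in amounts.items())


def fits_in_window(**amounts):
    """Return True if the estimated usage stays within one full window."""
    return window_fraction(**amounts) <= 1.0


if __name__ == "__main__":
    # Half the word budget plus half the audio budget fills the window.
    print(window_fraction(words=350_000, audio_hours=5.5))
    print(fits_in_window(words=700_000, code_lines=1))
```

By this estimate, a 350,000-word document plus five and a half hours of audio would use up the entire window, while a full 700,000 words leaves no room for anything else.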

“The longer and more complex the questions and interactions, the more context the model needs to process,” says Vinyals.

Gemini 1.5 Pro in action

So, for example, you can feed 1.5 Pro the transcript of the Apollo 11 mission and ask the AI to find funny moments. Or you can share a rudimentary drawing and ask the model to find the moment depicted in the sketch, such as Neil Armstrong's “one small step for man” quote.

“This is an example of uploading a very long document that you might not have had time to read, and then actually manipulating it in this really interesting way,” Vinyals said.

Gemini 1.5 Pro users can also ask the model to find specific moments in videos, including silent Buster Keaton movies, via text and images.

Finally, Gemini 1.5 Pro can “work in many languages” including Spanish.

When fed a grammar book and a dictionary for Kalamang, a language from Western New Guinea with fewer than 200 speakers, 1.5 Pro was able to translate sentences into English.

“It takes a few seconds to process, and now you're an expert in this language,” Vinyals said.

However, Gemini 1.5 Pro is still subject to common problems such as hallucinations.

“This model sometimes fails, and we are working as a community to improve these models,” Vinyals said. “But, of course, they're very useful, so you just have to understand their limitations.”

What about Ultra 1.0?

Less than a week ago, Google announced Gemini Advanced, a new “experience” that gives consumers who pay $20 a month access to the Ultra 1.0 AI model.

Does that mean Google has retired Ultra 1.0? Not according to Vinyals. “It will be a while before we release 1.5 Pro.”

