Date: 2024-02-15



Just two months after Google announced Gemini, the large language model it hopes will take it to the top of the AI industry, the company has already announced its successor. Google today released Gemini 1.5, making it available to developers and enterprise users ahead of a full public rollout soon. The company has made it clear that it is fully committed to Gemini as a business tool, a personal assistant, and everything in between, and it is pushing hard on those plans.

Gemini 1.5 brings many improvements. Gemini 1.5 Pro, the general-purpose model in Google's lineup, is clearly on par with the company's recently launched high-end Gemini Ultra, and it outperforms Gemini 1.0 Pro on 87 percent of benchmark tests. It is built using an increasingly popular technique known as Mixture of Experts (MoE). This means that instead of running the entire model every time you submit a query, it runs only the relevant parts of it. (Here's a good explanation of this.) This approach should make the model faster for users and more efficient for Google to run.
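
To make the idea of running only part of a model concrete, here is a minimal toy sketch of top-k expert routing. It is not Gemini's actual architecture, which Google has not detailed here. The gate weights, the scaling "experts", and the names `moe_forward` and `make_expert` are all hypothetical, chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical toy expert: multiplies every input value by `scale`."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts only.

    The remaining experts are never called, which is where the
    compute savings of a Mixture of Experts come from.
    """
    # One gate score per expert: dot product of that expert's gate row with x.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Indices of the top_k experts, best score first.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalise the chosen scores so their weights sum to 1.
    weights = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for weight, index in zip(weights, chosen):
        expert_out = experts[index](x)
        output = [o + weight * v for o, v in zip(output, expert_out)]
    return output, chosen


# Four toy experts; the gate below (assumed values) favours experts 0 and 1.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, chosen = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
print(chosen, output)
```

In this sketch only two of the four experts ever run for a given input. Real systems apply the same routing idea per token inside much larger neural networks, with gate weights that are learned rather than fixed.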

But there's one new thing about Gemini 1.5 that has CEO Sundar Pichai and the entire company particularly excited. Gemini 1.5 has a huge context window, which means it can handle much larger queries and look at much more information at once. That window is a whopping 1 million tokens, compared to 128,000 tokens for OpenAI's GPT-4 and 32,000 tokens for the current Gemini Pro. Tokens are a difficult metric to understand (here's a detailed breakdown), so Pichai puts it simply: "That's about 10 or 11 hours of video, tens of thousands of lines of code." A context window that large means you can ask the AI bot questions about all of that content at once.
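
As a rough illustration of what those window sizes mean in practice, the sketch below estimates whether a text of a given length would fit in each window. The window sizes are the ones quoted above. The ratio of about 4 tokens for every 3 English words is only a common rule of thumb, not a figure from Google, and real tokenizers vary. The function names are hypothetical.

```python
import math

# Context window sizes, in tokens, as quoted in the article.
CONTEXT_WINDOWS = {
    "Gemini 1.5 Pro": 1_000_000,
    "GPT-4": 128_000,
    "Gemini 1.0 Pro": 32_000,
}

# Assumption: roughly 4 tokens per 3 English words (a rule of thumb only).
TOKENS_PER_WORD = 4 / 3


def estimate_tokens(word_count, tokens_per_word=TOKENS_PER_WORD):
    """Rough token estimate for a text of `word_count` words."""
    return math.ceil(word_count * tokens_per_word)


def models_that_fit(word_count):
    """Names of the models whose context window could hold the whole text."""
    needed = estimate_tokens(word_count)
    return [name for name, limit in CONTEXT_WINDOWS.items() if needed <= limit]


# Example: a short report, a long novel, and a very large document set.
for words in (20_000, 90_000, 500_000):
    print(words, "words ->", models_that_fit(words))
```

Under these assumptions, a text of around half a million words would fit only in the 1 million token window, which is the kind of gap Pichai is describing.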

(Pichai also said that Google researchers are testing a 10 million token context window, which could take in the entire Game of Thrones series at once.)

While explaining this to me, Pichai offhandedly pointed out that you can fit the entire Lord of the Rings trilogy into that context window. This seemed too specific, so I asked him: this has already happened, right? Someone at Google is surely checking whether Gemini spots any continuity errors, trying to untangle Middle-earth's complex genealogy, and seeing whether AI can finally make sense of Tom Bombadil. "It has certainly happened," Pichai says with a laugh, "or it will happen, one of the two."

Pichai also believes that the expanded context window can be very helpful for businesses. "This enables use cases where you can add a lot of personal context and information at the moment of a query," he says. "Think of it as the query window having expanded significantly." He imagines that filmmakers might upload an entire film and ask Gemini what critics might say about it. He sees companies using Gemini to examine large volumes of financial records. "I think this is one of the great strides we've made," he says.

At this time, Gemini 1.5 is only available to business users and developers through Google's Vertex AI and AI Studio. It will eventually replace Gemini 1.0. The standard version of Gemini Pro, available to everyone on gemini.google.com and in the company's apps, will be 1.5 Pro with a 128,000 token context window. You will need to pay an additional fee to reach 1 million. Google is also testing the safety and ethical boundaries of the model, especially with regard to the new, larger context window.

Google is currently in a fierce race to build the best AI tools, as companies around the world try to figure out their own AI strategies and whether to sign developer deals with OpenAI, Google, or someone else. Just this week, OpenAI announced a memory feature for ChatGPT, and it appears to be getting ready to move into web search. So far, Gemini looks impressive, especially for those already in Google's ecosystem, but there's a lot left to be done on all fronts.

Ultimately, Pichai told me, these 1.0s and 1.5s, Pros and Ultras, and corporate battles will become less important to users. "People will just consume the experience," he says. "It's like using a smartphone without constantly paying attention to the processor underneath it." But for now, he says, we're still at the stage where everyone knows what chip is in their phone, because it matters. "The underlying technology is changing very rapidly," he says. "People care."

