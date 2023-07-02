



Recent advances in AI have been eye-opening. Barely a week has gone by without a new algorithm, application, or implication making the headlines. But OpenAI, the source of much of the hype, only recently completed its flagship algorithm, GPT-4, and according to OpenAI CEO Sam Altman, its successor, GPT-5, has not started training yet.

It’s possible the tempo will slow down in the coming months, but don’t bet on it. A new AI model as capable as GPT-4, or more so, may arrive sooner rather than later.

Google DeepMind CEO Demis Hassabis told Wired’s Will Knight in an interview this week that the next big model, Gemini, is currently in development, a process he said will take months. Hassabis described Gemini as a mashup drawing on AI’s greatest hits, most notably DeepMind’s AlphaGo, which used reinforcement learning to beat a Go champion in 2016, years before experts expected the feat.

“Broadly speaking, Gemini can be thought of as combining some of the strengths of AlphaGo-type systems with the amazing language capabilities of large models,” Hassabis told Wired. “There are also some very interesting new innovations.” Overall, the new algorithms should be better at planning and problem solving, he said.

Era of AI fusion

Many of the recent advances in AI have come from algorithms growing ever larger and consuming more data. As engineers increased the number of internal connections, or parameters, and began training models on internet-scale datasets, quality and capability improved like clockwork. As long as a team had the money to buy chips and access to data, progress was almost automatic, because the structure of the algorithms, called transformers, didn’t have to change significantly.

Then in April, Altman said the era of ever-larger AI models was over. Training costs and computing requirements had skyrocketed, while the gains from scaling had plateaued. “We’re going to improve it in other ways,” he said, without elaborating on what those other ways might look like.

GPT-4, and now Gemini, offer clues.

At Google’s I/O developer conference last month, CEO Sundar Pichai announced that Gemini was under development. He said the model was built “from the ground up” to be multimodal, meaning that it can be trained on and fuse multiple types of data, such as images and text, and that it is designed for API integration (think plugins). Add in reinforcement learning and perhaps other DeepMind specialties in robotics and neuroscience, Knight speculates, and the next step in AI is starting to look like a high-tech quilt.

But Gemini would not be the first multimodal algorithm. Nor would it be the first to use reinforcement learning or support plugins. OpenAI has integrated all of these into GPT-4 to great effect.

If Gemini goes that far and no further, it might match GPT-4. What is interesting is who is working on the algorithm. Earlier this year, DeepMind joined forces with Google Brain. The latter invented the first transformers in 2017; the former designed AlphaGo and its successors. Combining DeepMind’s reinforcement learning expertise with large language models could lead to new capabilities.

In addition, Gemini could set a new high-water mark in AI without a big leap in size.

GPT-4 is believed to have about a trillion parameters, and recent rumors suggest that it may be a “mixture-of-experts” model made up of eight smaller models, each a fine-tuned specialist about the same size as GPT-3. OpenAI has not confirmed the size or architecture, however; for the first time, it did not publish the specifications of its latest model.
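To make the rumored design more concrete, here is a minimal sketch of how a mixture-of-experts layer routes an input. It is a generic illustration only, not GPT-4’s actual architecture, which OpenAI has not disclosed. The names `make_expert` and `moe_forward`, the toy "experts" that simply scale their input, the random gate weights, and the choice of two active experts per input are all assumptions made for the example.

```python
import math
import random

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    # Hypothetical stand-in for a specialist model: it just scales its input.
    return lambda x: [scale * v for v in x]

def moe_forward(x, gate_weights, experts, top_k=2):
    # Score each expert with a dot product between the input and its gate vector.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Keep only the top_k highest-scoring experts; the rest are never run.
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalize the selected scores into mixing weights.
    weights = softmax([scores[i] for i in top])
    out = [0.0] * len(x)
    for w, i in zip(weights, top):
        y = experts[i](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, top

random.seed(0)
NUM_EXPERTS, DIM = 8, 4  # eight experts, as in the rumor; the dimension is arbitrary
experts = [make_expert(i + 1) for i in range(NUM_EXPERTS)]
gate_weights = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
output, chosen = moe_forward([0.5, -1.0, 2.0, 0.1], gate_weights, experts)
print("experts used:", chosen)
```

The point of the design is that only a few experts run for any given input, so the total parameter count can be large while the compute spent per input stays closer to that of a much smaller model.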

Similarly, DeepMind has shown interest in building smaller models that punch above their weight class (Chinchilla), and Google has experimented with mixture-of-experts models (GLaM).

Gemini may be a little bigger or smaller than GPT-4, but probably not by much.

Still, we may never know exactly how Gemini works, as increasingly competitive companies keep the details of their models secret. That makes testing advanced models for capability and controllability as they are built more important, work that Hassabis suggested is also critical for safety. He also said that Google may open models like Gemini to outside researchers for evaluation.

He said he hopes the academic community will have early access to these frontier models.

It remains to be seen whether Gemini will match or exceed GPT-4. As architectures become more complex, the gains may not be automatic. Still, the fusion of data and approaches (text with images and other inputs, large language models with reinforcement learning models, and smaller models stitched together into a larger whole) may be what Altman had in mind when he said AI would improve in ways other than raw size.

When can we expect Gemini?

Hassabis was vague about the exact schedule. If he meant that training will take “months” to complete, it could be a while before Gemini launches. A trained model is no longer the endpoint: OpenAI spent months rigorously testing and fine-tuning the raw GPT-4 model before its final release. Google may be even more cautious.

But Google DeepMind is under pressure to deliver a product that sets the bar in AI, so it wouldn’t be surprising to see Gemini later this year or early next year. If so, and if Gemini lives up to its billing (both big question marks), Google could regain the spotlight from OpenAI, at least for the time being.

