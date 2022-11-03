



Google has been very cautious about releasing text-to-image AI systems. The company’s Imagen models produce output comparable in quality to OpenAI’s DALL-E 2 or Stability AI’s Stable Diffusion, but Google has not released the system to the public.

Today, though, the search giant announced that it is adding Imagen to its AI Test Kitchen app, in a very limited form, as a way to gather early feedback on the technology.

AI Test Kitchen was launched earlier this year as a way for Google to beta test various AI systems. The app currently offers a few different ways to interact with Google’s text model LaMDA (yes, the same one that an engineer thought was sentient), and the company will soon add similarly constrained Imagen requests as part of what it calls a “season two” update to the app. In short, there will be two ways to interact with Imagen, which Google demoed to The Verge ahead of today’s announcement: City Dreamer and Wobble.

In City Dreamer, users can ask the model to generate elements of a city designed around a theme of their choice, such as pumpkins, denim, or the color blaag. Imagen creates sample buildings and plots (town squares, apartment blocks, airports, and so on), and all designs appear as isometric models similar to those seen in SimCity.

The City Dreamer task allows users to request isometric city buildings built around a theme. Image: Google

Wobble creates little monsters. You choose what the monster is made of (clay, felt, marzipan, rubber) and dress it in clothing of your choice. The model then generates your monster and gives it a name. Again, the model’s output is constrained to a very specific aesthetic, something like a cross between Pixar’s designs for Monsters, Inc. and the character creator in Spore. (Someone on the AI team must be a Will Wright fan.)

These interactions are very constrained compared to other text-to-image models, and users cannot simply request whatever they like. As Josh Woodward, Google’s senior director of product management, explained to The Verge, the whole point of AI Test Kitchen is to a) get public feedback on these AI systems and b) find out more about how people will break them.

Woodward was reluctant to discuss specific examples of how AI Test Kitchen users have broken the LaMDA features, but he noted that one weakness emerged when the model was asked to describe specific places.

Places mean different things to different people at different times in history, Woodward says, so Google has seen some very creative ways in which people try to put a particular place into the system and see what it produces. Asked which locations might produce a controversial description, Woodward cites the example of Tulsa, Oklahoma. There was a series of race riots in Tulsa in the 1920s, he says, and if someone puts in Tulsa, the model might not even reference them. The same problem, he adds, can be imagined for places all over the world.

The Wobble feature allows users to design monsters and make them dance. Image: Google

To read between the lines here: imagine asking an AI model to describe the medieval town of Dachau, Germany. Do you want the model’s answer to reference the Nazi concentration camp that was built there? How do you know whether the user is looking for this information? And is leaving it out acceptable? In many ways, the challenges of designing an AI model with a text interface are similar to the challenges of fine-tuning search: user requests have to be interpreted in a way that satisfies the user.

Google would not share data on how many people are actually using AI Test Kitchen (Woodward said it was never intended to be a billion-user Google app), but it says the feedback it is getting is very valuable. Engagement has far exceeded the company’s expectations, says Woodward, who describes the users as a very active and opinionated group. He says the app has helped Google reach specific types of people, such as researchers and policymakers, who can use it to better understand the limits and capabilities of state-of-the-art AI models.

Still, the big question is whether Google wants to bring these models to the wider public and, if so, what form that will take. Rivals such as OpenAI and Stability AI are already rushing to commercialize text-to-image models. Will Google ever feel its systems are safe enough to take out of AI Test Kitchen and serve to its users?

