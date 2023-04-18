



Google has announced an advanced AI music generator that can turn text snippets into songs.

AI revolution: ChatGPT, DALL-E 2, and other advanced AIs that can generate impressive text and images in response to user prompts exploded in popularity in 2022, but they weren’t the first generative AIs, nor are they the only examples of what can be done with neural networks.

Several companies have trained AI to generate music in response to text, voice, or image prompts. OpenAI, the research company behind ChatGPT and DALL-E 2, released an AI music generator called “Jukebox” in 2020.

These systems have not been as enthusiastically received as those that generate text and images, mainly because the output is not as impressive. Most of them produce low-fidelity, simplistic music that lacks traditional song structures such as repeating choruses.

Introducing Jukebox, a neural network that generates raw audio for music, including rudimentary singing, in a variety of genres and artist styles. We are releasing a tool that allows anyone to explore the generated samples as well as the model and code: https://t.co/EUq7hNZv62 pic.twitter.com/sh5yHz7qrc

— OpenAI (@OpenAI) April 30, 2020

What’s new: Music-generating AI is improving, however, and perhaps the most impressive example of the technology is MusicLM, an AI music generator announced by Google in January 2023.

The system can generate clips up to 5 minutes long based on text descriptions. The music won’t win a Grammy, but the audio sounds more like a human recording than any other AI-generated clip.

How it works: Google trained MusicLM on over 280,000 hours of music sourced from MuLan, a model trained to link music to descriptions written in natural language.

Google then created MusicCaps, a public dataset of over 5,500 music clips for use in evaluating AI music generators. Skilled musicians wrote a caption for each clip, along with a list of aspects describing its genre, mood, and more.

During the evaluation phase, Google compared MusicLM to two other text-to-music AIs (Mubert and Riffusion), using several quantitative metrics to assess the clips’ audio quality and adherence to the text descriptions.

The researchers also presented human raters with a MusicCaps description and two audio clips. The pair could be two AI-generated clips, or one AI-generated clip and the music that the MusicCaps description was based on. The raters then selected the clip they believed best matched the description.
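
To make the head-to-head rating scheme concrete, here is a minimal Python sketch of how such pairwise judgments could be tallied into a win rate per system. It is an illustration only, not Google's evaluation code: the function name `tally_pairwise_wins`, the labels `model_a`, `model_b`, and `reference`, and the sample judgments are all hypothetical and are not data from the paper.

```python
from collections import Counter


def tally_pairwise_wins(judgments):
    """Return the share of comparisons each system won.

    judgments: iterable of (first, second, winner) tuples, where winner
    is whichever of the two clips' sources the rater preferred.
    """
    wins = Counter()
    appearances = Counter()
    for first, second, winner in judgments:
        if winner not in (first, second):
            raise ValueError(f"winner {winner!r} was not in the pair")
        appearances[first] += 1
        appearances[second] += 1
        wins[winner] += 1
    # Win rate = comparisons won / comparisons the system took part in.
    return {name: wins[name] / appearances[name] for name in appearances}


# Hypothetical judgments, invented purely for illustration.
sample_judgments = [
    ("model_a", "model_b", "model_a"),
    ("model_a", "model_b", "model_a"),
    ("model_a", "reference", "reference"),
    ("model_b", "reference", "reference"),
]

rates = tally_pairwise_wins(sample_judgments)
for name, rate in sorted(rates.items()):
    print(f"{name}: {rate:.2f}")
```

A system that wins most of the comparisons it appears in ends up with a win rate close to 1, which is one simple way to summarize which system raters preferred.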

According to a paper shared by Google on the preprint server arXiv, MusicLM generally outperformed the other AIs.

“We strongly emphasize that more work is needed going forward to address these risks associated with music generation.”

Agostinelli et al.

Looking ahead: Google’s AI music generator may be able to produce audio that closely resembles human-written music, but it still cannot reproduce the structure of traditional songs, and the vocals it produces are particularly poor: the quality is too low to make out the lyrics.

Google says future work on the system may focus on these issues, as well as on improving the overall audio quality and addressing the issue that has kept MusicLM from being released to the public: about 1% of its output reproduces music from its training data.

“We are aware of the potential misappropriation risks of creative content associated with our use cases…we strongly emphasize that more work is needed going forward to address these risks associated with music generation,” the researchers write.

