Date: 2023-07-07



From ChatGPT writing code for software engineers to Bing’s search engine standing in for the bi-weekly Hinge binge, we’re obsessed with the ability of artificial intelligence to replace us.

Within the creative industries, this obsession manifests itself in generative AI. As models like DALL-E generate images from text prompts, the popularity of generative AI calls into question how we understand the integrity of the creative process. If generative models can give form to ideas, what happens to artists who are no longer the ones generating them?

MusicLM, Google’s new text-to-music AI, offers an interesting answer to this viral Terminator-meets-Ex Machina narrative. As a model for generating high-fidelity music from textual descriptions, MusicLM embraces moments lost in translation and encourages creative exploration. It sets itself apart from other music generation models such as Jukedeck and MuseNet by encouraging users to verbalize their own ideas rather than toggling between existing music samples.

It’s hard to explain how I feel

AI in music is nothing new. But from recommending songs for Spotify’s Discover Weekly playlist to composing royalty-free music with Jukedeck, applications of AI in music have so far sidestepped the longstanding challenge of mapping words directly to music.

This is because music is itself a form of expression, and it is heard differently depending on who is listening. Just as different languages struggle to perfectly convey the nuances of their respective cultures, it is difficult (if not impossible) to capture every aspect of music exhaustively in words.

MusicLM tackles this challenge by generating audio clips from descriptions such as gentle violin melodies backed by distorted guitar riffs, while also accounting for less tangible inputs such as hypnotic or trance-like moods. It approaches the thorny issue of music classification from a fresh angle. Rather than focusing on lofty notions of style, MusicLM builds on more specific musical attributes, with tags such as snappy or amateurish. While integrating more widely accepted concepts of genre and songwriting technique, it also broadly considers where audio clips come from (e.g. YouTube tutorials) and the common emotional responses they evoke (e.g. mad love).

What you expect is not what you get

Beyond this theoretical problem of music classification lies the practical problem of scarce training data. Unlike its creative counterparts (such as DALL-E), MusicLM doesn’t have a wealth of ready-made text captions for audio to draw on.

MusicLM was trained on MusicCaps, a library of 5,521 music samples captioned by musicians. Bound by the practical limits of human effort and by almost philosophical questions of style, MusicCaps offers limited granularity in the semantic interpretation of musical attributes. As a result, there is an occasional gap between user input and generated output, and the happy, energetic song you asked for may not be the one you get.

But when asked about this discrepancy, MusicLM researcher Chris Donahue and research software engineer Andrea Agostinelli praise the model’s human element. They describe key applications such as “[exploring] ideas more efficiently [or overcoming] ...”, and note that MusicLM offers multiple interpretations of the same prompt, so even if one generated track doesn’t meet your expectations, another might.

“This [disconnect] is a big research direction for us,” says Andrea, who admits that there is no single answer. Chris attributes the disconnect to the abstract relationship between music and text, noting that how we respond to music is “[even more] loosely defined.”

In a way, MusicLM’s language-based structure positions the model as a sounding board, facilitating interactions that welcome moments lost in translation. When you prompt the model with a vague idea, the approximations it generates can help you work out what you actually want to create.

Beauty is in breaking things

Drawing on their experience producing the Grammy-nominated album Chain Tripping (2019), made entirely with MusicVAE (another music-generating AI developed by Google), the band YACHT weighs in on the future of MusicLM in music production. “As long as you can break it down and tinker a bit, I think it has a lot of potential,” says frontwoman Claire L. Evans.

For YACHT, generative AI exists not as an end in itself, but as a means to an end. “You can never make exactly what you want to make,” says founding member Jona Bechtolt, describing how studio sessions go. Claire adds that this is “because you have an imperfect conduit,” and says that what makes music-making fascinating and exciting is the unexpected disconnect that occurs when an artist puts pen to paper.

The band describes how the gap between user input and generated work fuels creativity through iteration. “There’s a discursive quality to [MusicLM]. I think it’s the surreal feeling of seeing something in the mirror that gives you feedback,” says Claire. “It’s like a funhouse mirror.” “A computer accent,” jokes band member Rob Kieswetter, referring to a documentary about the band’s experience making Chain Tripping.

However, in discussing the implications of this shift from text to audio production, Claire cautions against the rise of taxonomy in music. “Imperfect semantic elements are great; it’s the perfect ones we should worry about. [Labels] create boundaries to discovery and creation that don’t have to exist. Everyone is conditioned to think of music as a salad of super-specific genre references [that can be used] to create new songs.”

Nevertheless, YACHT and the MusicLM team agree that MusicLM in its current form is promising. Either way, Rob argues, many new artists will end up fine-tuning the tool to their own needs.

Andrea recalls examples of creative tools that became popular for something other than their intended purpose. “Synthesizers eventually opened up a huge wave of new genres and modes of expression. [It unlocked] a new way to express music, even for non-musicians.” “Historically, predicting how each music technology will unfold has been rather difficult,” concludes Chris.

Happy coincidence, reinvention, self-discovery

Back to the stubborn and unforgiving question: will generative AI replace musicians? Probably not.

The relationship between artists and AI is not linear. It is tempting to prescribe a complex, carefully deliberated system of collaboration between an artist and an AI, but at the moment the process of using AI for art-making resembles a friendly game of trial and error.

In music, AI gives us room to explore the space between what we describe and what we actually mean. It brings our ideas to life in ways that help shape creative direction. By highlighting these crucial moments lost in translation, a tool like MusicLM can help set us up for what actually ends up on stage and on Discover Weekly.

Tiffany Ng is an arts and technology writer based in New York. Her work has been featured in iD Vice, Vogue, South China Morning Post and Highsnobiety.

