
Random Musings

Generative AI Music
By Larry Gee
Posted: 2024-09-30T10:12:00Z


Image Source: Generated using Stable Diffusion

Image Prompt: Create a photorealistic close-up of a futuristic robotic hand using a quill pen to write a partially completed music score on a parchment paper with some shadows.



Generative AI Music - A paradigm shift for music creation




1. Intro



In February 2023, I wrote about how ChatGPT could be used in writing. We now know generative AI can do everything from generating exercise programs, passing law exams, building websites, and creating photorealistic images to creating election disinformation and deepfakes. In May of this year, OpenAI announced GPT-4o, a chatbot capable of responding to text, audio, images, and video, and of solving both visual and linguistic tasks. Chatbots like GPT-4o that use multimodal Large Language Models (LLMs) can turn still images into video clips, name varieties of plants in a photo, analyze a photo to identify where it was taken, and describe what's in a camera's field of view to assist people who are blind or have low vision. The generative AI genie is out of the bottle, and today I'm writing about how generative AI tools are used to make music, ways you can try popular tools, and my initial impression of the sound.



2. How Does it Work


With the introduction of generative AI, the new field of prompt engineering has emerged. Developing and refining text prompts is essential for creating meaningful output from generative AI-based systems, including music generation tools.

To create music, you simply enter a text prompt describing what you want. The tool reads the prompt and attempts to produce music (and lyrics) in the style the prompt describes. Prompts can range from simple to complex, including the use of meta tags to define the structure of the composition. For a basic introduction to music generation prompts, read this blog post on Limewire.com.
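As a rough illustration of what a structured prompt might look like, here is a minimal Python sketch that assembles a style description and lyric sections marked with bracketed meta tags. The helper name `build_prompt`, the "Style:" line, the tag names, and the sample lyrics are all my own assumptions for illustration. Each tool has its own prompt conventions, so check the tool's documentation before relying on any particular tag.

```python
def build_prompt(style, sections):
    """Assemble a text prompt: a style line followed by tagged lyric sections.

    `style` is a free-text description of the desired sound.
    `sections` is a list of (tag, lyric_lines) pairs, in song order.
    The bracketed tag format is an assumption, not any tool's official syntax.
    """
    lines = [f"Style: {style}", ""]
    for tag, lyrics in sections:
        lines.append(f"[{tag}]")   # meta tag marking the start of a section
        lines.extend(lyrics)       # the lyric lines for that section
        lines.append("")           # blank line between sections
    return "\n".join(lines).strip()


# Hypothetical example: a two-section song in a folk style.
prompt = build_prompt(
    "mellow acoustic folk, female vocals, 90 bpm",
    [
        ("Verse", ["Morning light on the kitchen floor",
                   "Coffee steam and an open door"]),
        ("Chorus", ["Take me home, take me home"]),
    ],
)
print(prompt)
```

The printed text is what you would paste into the tool's prompt box. Keeping the style and the song structure as separate inputs makes it easy to try the same lyrics in a different genre.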

For those interested in a deeper dive into the inner workings of AI music generation, start at the GitHub page "Awesome Music Generation with AI", where you can begin your journey down the AI music generation rabbit hole. This page is a curated collection of resources, projects, and frameworks for musicians, researchers, and developers who wish to delve into the latest developments in AI music generation.



3. Take AI Music Generation for a Test Spin


To get a taste of what AI-generated music sounds like, check out this site that summarizes and ranks the "8 Best AI Music Generators in 2024". Suno and Udio are the most popular platforms at the moment. Visit each site and play the available sample tracks to experience AI-generated music. If you wish to create your own music, you will need to create a free account and sign in, but even a few minutes spent with the sample tracks on each site will give you a feel for the quality of sound you can expect from AI music.



4. What About The Sound


I found some of the music to be quite listenable. However, a good traditional recording can transport you to the place where the artist performed, whether live or in the studio. My brief experience with the current generation of tools has shown me that AI-generated music cannot do that. While the better tools can create interesting melodies and lyrics, much of the music sounds synthetic and somewhat repetitive. Some sounds, like the decay of cymbals, seem slightly truncated. The body and emotion generated by acoustic instruments seem lacking. Artifacts like these make the music sound artificially created.

The sound of AI-generated music is, for the most part, two-dimensional and, in a word, "lifeless" compared to human performances on a good recording. AI music, while enjoyable, lacks the degree of "realism" that could transport me to another place. The furthest I was transported was to the nearest elevator, because much of it sounded like elevator music. While acceptable for background music or listening in a car, most AI-generated music I've heard is not mentally engaging. The realism of the music will likely improve in the future, but for now, I feel that AI-generated music is only suitable for background music (BGM).



5. Conclusion


The music creation process, as we know it, will be forever changed by the introduction of AI-based music creation tools. Physical instruments are no longer required, nor are human vocalists, male or female. With generative AI-based tools, anyone who can type a sentence can create music in any genre within minutes, without any background in music theory, without ever having played an instrument, and without ever having sung a note.


While the introduction of these tools has the potential to radically change the landscape of the music industry by relegating some musicians, singers, and songwriters to the same fate as lumberjacks from the Pacific Northwest, these new tools, in my view, still have a way to go to recreate the emotion, excitement, and realism of a live human performance. Without a doubt, the tools can create music and lyrics at a blazingly fast pace, producing perfectly enjoyable BGM for gaming, studying, and casual listening. However, there is still much room for improvement.


The current performance of generative AI music tools can be likened to the early days of computer-generated imagery (CGI) used to create human images. Back then, CGI images resembled humans, but there was always a slight disconnect that made them easily identifiable as CGI. However, with advancements in AI and generative adversarial networks (GANs), image generation evolved to create shockingly lifelike human images. AI-generated music will need to follow a similar path to maturity.


Today's music generation tools usher in an era of new software-based "instruments" for creating music. These new instruments give everyone the ability to turn text into music. During my experimentation with these tools, I have found that the energy of a believable live performance gets lost in translation, at least for now and at least when it comes to acoustic instruments and vocals. A performance's energy, emotion, and three-dimensionality may come in time. Still, just as CGI struggled for years to recreate believable human characters, it will be some time before AI music creation tools produce acoustic music equal to that made by our best live human singers and musicians.

