When I began working on my previous post about creating motion imagery with AI, I searched for an AI tool that would help me write backing music for the video. Suno.ai had just been made available via Microsoft Copilot, so I gave it a shot.
Suno is a song generator (unlike the instrumental music I generated for this post), and it operates in much the same way as AI image generators do. You give a text prompt about what you want, and Suno calculates for a bit and spits out a 'song'. I use scare quotes because the song is just a one-minute clip that ends abruptly. Even so, the first time I entered a prompt and got a song back, I was flabbergasted at what is possible. I've been using AI image generation for a couple of years now, so I've gotten somewhat desensitized to the incredible technology behind it. But this level of music generation is very new and truly stunning.
SIXTEEN VERSIONS OF SLEEPWALK IN DIFFERENT MUSICAL STYLES
It's not without its flaws, though. In addition to the duration limitation of the generations, the prompt itself is extremely limited in length and effectiveness. Current image generators have grown to allow much longer contexts to describe the desired outcome, but Suno's prompt is hardly long enough to give a style of music and a few extra words to guide it. By default, Suno will create lyrics for you, and they are always severely lacking. It's kind of ironic that these Large-Language-Model-based applications are so bad at creative writing.
The first step to creating the song I used for the Sleepwalk video was to ask Suno to create songs in a variety of genres where intense and angry music was likely to result. I rerolled many times as I began to understand the capabilities of the lyric generation as well as the music itself.
I was hoping to have the lyrics be close to 100% AI-generated, so I culled the best lines from all the rerolls into a big document and picked and chose out of that document to create the first draft of the lyrics that would become the song. But Suno, ChatGPT, and Copilot, all being really terrible at creative writing, came up with very little usable stuff. So after I cobbled together the lyrics, I would end up changing most of them.
That process was aided by discovering the 'Advanced' mode on the Suno website, which does something clever: it allows you to write two separate prompts. You still get the too-short descriptive prompt for the music, but you also get a much longer area to write the lyrics that you want. By using simple prompting techniques, you can control the structure of the song to a certain degree. Another extremely valuable advanced feature is that you can choose a previously generated clip from which to continue, and then generate another minute or so of the song. The two clips will usually flow together pretty nicely.
Through many many (many) generations of partial clips in a variety of styles, I heard the pretty terrible lyrics over and over again, and I just... couldn't. So I found myself adjusting the lyrics little by little, trying to make them fit together rhythmically and trying to make them less banal. In the end, I changed or replaced at least 80% of the lyrics, though the general sense of them still may be felt. I hardly want to take credit for the lyrics though, as I am no lyricist and I was trying to maintain the contributions of the AI as best I could. The one line of the song that the AI must be given total credit for is the core lyric of the entire song: "Hey, hey, wake up, wake up. The world's on fire and you're fast asleep." This line completely represents what I asked for in the prompt, which used the words "sleepwalking into oblivion". Given the rest of the generated lyrics, I was quite surprised at how good that one was!
Trying to find the right style of music to back the video I was creating was quite difficult. Although the imagery in the video merited fast, loud, and intense music, the video generators had created footage that had a slow-motion feel to it. Using a combination of a Rock song and a Punk song, I thought that my first cobbled-together effort at cutting a video was pretty good. But at the urging of my better half, Krysia Lukkason, I decided to go back and try to create a slower but still intense song along the lines of Closer by Nine Inch Nails.
It turns out that Trent Reznor is an above-average songwriter, and Suno is... not. Like many generative AI tools out there these days, Suno will not accept any artist or song names in its prompts, so you can't just say "Give me a song like Closer by Nine Inch Nails, please". The app will scold you and tell you to try again if you prompt it in that way.
So, I tried many many (many) times to find a song that worked in a way similar to the Nine Inch Nails classic. I started by trying to figure out what genre that song would be placed in. But really, are there any other songs like it? The closest I came was "Industrial Rock". That never worked very well for the video, so I continued to expand my horizons trying to find that balance between intensity and tempo that would work for the video.
The work of creating a song from intro to outro was far more difficult than simply creating a cool 60-second clip of a song in a given genre. The technical details of how Suno works (and how it really doesn't work) are boring unless you're going to try Suno yourself, in which case I recommend checking out the Suno Discord server to get a lot of great support.
I didn't need an entire song to make the short demo video, but I wanted there to be a real song behind it. As I rerolled and rerolled on Suno searching for a perfect song for the video, I eventually decided that such a song would not come into being. It turns out the video itself needed to be edited at a much quicker pace, which I feel fixed the problem in the end.
But in the process, I had created a large number of songs with essentially the same lyrics in a bunch of different styles, which I think is pretty cool. I chose the best sixteen (yes, there were more!) and present them to you here. They vary in quality, but each has at least some element that I find to be exceptionally good given that it came from an AI. My favorites tend toward the top of the gallery above, but your mileage may vary. The very first one, labeled "Rock", is the one used in the final video, now in its complete form.
The album cover for each of the songs was generated in DALL-E 3 via Microsoft Copilot, and then heavily edited in Photoshop. The links in the gallery lead to YouTube videos of the songs, and if you'd prefer to just listen to them without clicking through each link in the gallery, you can visit the YouTube playlist. To keep a little visual interest while the songs play, I added animated audio waveforms on the video track, which were created with Veed.io and composited in DaVinci Resolve.
Enjoy, and if you listen, please let me know which ones you like!