Parabolica #18 - Latest AI Audio Capabilities
AI audio is making waves in mainstream news thanks to the viral AI-generated Drake and The Weeknd song. Grimes also offered to split profits 50/50 on any AI-generated song that uses her voice.
I covered AI audio in this post a few months ago:
This space is evolving fast, so I wanted to take another look and see how much it has improved.
Before we begin, it’s important to split AI audio into two different buckets:
voice2voice
text2audio
You can think of it like AI video where:
voice2voice = video2video (e.g. Deforum video input, batch img2img, etc.)
text2audio = text2video (e.g. Runway’s Gen-2)
If you want a polished result today, you still have to supply a source audio or video clip and apply a style to it.
Pure text2audio and text2video aren’t there yet, but they are close!
Recent Hip-Hop AI voice2voice examples
In these examples, the creator rapped to the track themselves and used a fine-tuned model to sound like the artists:
Text2Audio
These examples use a text prompt to instruct the AI and give it direction:
Bark AI
https://github.com/suno-ai/bark
“I’ve got a secret to tell you. I can pass the Turing test.”
“I got a face for radio and like, what do they say? A voice for — um — for print? [laughs]”
Tango AI
https://github.com/declare-lab/tango
“A man speaks followed by a popping noise and laughter”
“People cheering in a stadium while rolling thunder and lightning strikes”
Google’s MusicLM
https://google-research.github.io/seanet/musiclm/examples/
“Slow tempo, bass-and-drums-led reggae song. Sustained electric guitar. High-pitched bongos with ringing tones. Vocals are relaxed with a laid-back feel, very expressive.”
MusicLM can also transfer audio styles! You can give it an audio source and tell it to apply the ‘guitar solo’ style.
Original humming:
‘guitar solo’
‘tribal drums and flute’
Takeaways for growth marketers
AI voices have been good enough to use in ads for months. You can use tools like ElevenLabs.io, Murf.ai, or Play.ht.
Soon enough you will be able to create music, sound effects, laughter, etc. with text prompts. My guess is within a few weeks to a month.
You can use AI to generate audio for your entire ad, from the voiceover to the music to the unboxing sounds.
You won’t have complete control over the audio output, but it should give an incremental lift in ad performance.
You can test by duplicating your ad and adding the AI audio tracks.
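To judge whether the duplicated ad with AI audio really outperforms the original, one option is a two-proportion z-test on the results. The sketch below is only an illustration, not a prescribed method: it assumes you compare click-through rate (clicks divided by impressions), the function name is my own, and the numbers are made up for the example.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rate of ad B (AI audio) against ad A (original).

    conv_* are conversion counts (clicks here, by assumption) and n_* are
    impression counts. Returns (absolute_lift, z_score, two_sided_p_value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("impression counts must be positive")

    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled rate under the null hypothesis that both ads perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # No variation at all (e.g. zero conversions in both ads).
        return rate_b - rate_a, 0.0, 1.0

    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value


# Hypothetical numbers: original ad vs. the duplicate with AI audio tracks.
lift, z, p = two_proportion_z_test(conv_a=200, n_a=10_000, conv_b=240, n_b=10_000)
print(f"absolute lift: {lift:.4f}, z: {z:.2f}, p-value: {p:.3f}")
```

A small p-value (0.05 is a common threshold) suggests the difference is unlikely to be noise. With low impression counts the test has little power, so let both ads run long enough before calling a winner.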
If you enjoyed this post, you can help me out by sharing it with someone you think would find it valuable!
Thanks,
Manson