Voice cloning company ElevenLabs has officially launched Text to Sound Effects, a new tool that lets users generate sound effects for their creative projects.
The tool, which was initially announced in February, can generate sound effects up to 22 seconds long from basic user prompts like “dog barking,” “creaking doors,” “thunderstorm,” and “loud car engine.” It can also produce short music tracks such as guitar loops, jazz saxophone solos, and techno loops. These sounds can be combined with the company’s voice and music platform, and each prompt gives users at least four downloadable audio clip options.
“Over the last year, we’ve revolutionised AI Voices by producing the first truly emotive, human-like text-to-speech platform,” Mati Staniszewski, ElevenLabs’ co-founder and CEO, said in a statement. “With the launch of text-to-sound effects, we’re marking another major step forward, one that will equip creators with more audio tools to help them produce high-quality content.”
ElevenLabs emphasises that the tool is meant to aid people, including those in the filmmaking and entertainment industries, content creators, and developers, in generating the rich soundscapes required to bring their podcasts, games, and movies to life “quickly, affordably, and at scale.”
ElevenLabs utilises advanced machine learning (ML) algorithms to interpret text descriptions of sounds and convert them into realistic sound effects. This enables creators to effortlessly produce sounds based on the type of project they’re working on without having to search through libraries or record from scratch.
Free users receive 10,000 characters per month, and each sound effect generation uses approximately 150 characters per request. At that rate, free-tier users can create about 66 sound effects per month. Additionally, when publishing any content containing these sound clips, free users must attribute the sound to “elevenlabs.io” in the title.
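As a rough check on the free-tier arithmetic, the sketch below divides the monthly quota by the approximate per-request cost. The function name and constants are illustrative only and are not part of any ElevenLabs API. The 150-character figure is the approximation quoted above, so the real count would shift with the actual cost of each request.

```python
def max_generations(monthly_characters: int, characters_per_request: int) -> int:
    """Return how many whole requests fit in a monthly character quota."""
    if characters_per_request <= 0:
        raise ValueError("characters_per_request must be positive")
    # Integer division: a partially funded request does not count.
    return monthly_characters // characters_per_request


# Figures quoted in the article; the per-request cost is approximate.
FREE_TIER_CHARACTERS = 10_000
APPROX_CHARS_PER_REQUEST = 150

print(max_generations(FREE_TIER_CHARACTERS, APPROX_CHARS_PER_REQUEST))  # 66
```

At exactly 150 characters per request, the quota covers 66 whole generations, with 100 characters left over.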
ElevenLabs trained its model using Shutterstock’s audio library, which contains licensed tracks. The company also noted that some users, including video game developers, social media content creators, marketers, and film producers, tried out the tool during its alpha testing phase.
The startup also mentioned that the tool does not permit sound generation through prompts that breach its Prohibited Content and Uses Policy, which covers topics such as self-harm, threats to child safety, and fraud.
Other AI developers are also working on their own text-to-sound generators. OpenAI has Jukebox, Meta has AudioCraft, and Google has launched MusicFX.