
This AI cloned my voice using just three minutes of audio

There’s a scene in Mission Impossible 3 that you might recall. In it, our hero Ethan Hunt (Tom Cruise) tackles the movie’s villain, holds him at gunpoint, and forces him to read a bizarre series of sentences aloud.

“The pleasure of Busby’s company is what I most enjoy,” he reluctantly reads. “He put a tack on Miss Yancy’s chair, and she called him a horrible boy. At the end of the month, he was flinging two kittens across the width of the room ...”

The words he’s reading sound random and unimportant, but it quickly becomes clear that they aren’t random at all: they’re deliberately designed to help a software program clone his voice. Once he finishes the passage, the software parses the audio and instantly gives Hunt the ability to speak and sound exactly like the bad guy, the final piece of his near-perfect disguise.

Now if you take that scene and subtract all the espionage, guns, and dramatic tension, you’re left with a pretty solid example of what I experienced at CES today during a demo of My Own Voice, an AI-powered “voice banking” service from a French startup called Acapela Group.

The company’s raison d’être is to help people who will eventually lose the ability to speak, typically as a result of injury or of illnesses like ALS, Huntington’s disease, and laryngeal cancer. Whatever the cause may be, the company’s My Own Voice platform allows a person to synthetically clone their voice and preserve the unique tone, timbre, and personality that make it theirs, something that’s typically lost with most text-to-speech software (think Stephen Hawking).

Now to be fair, voice cloning tech isn’t necessarily new or technologically groundbreaking at this point. Such services have existed for years, and thanks in part to the advent of deepfakes, there are currently dozens of other companies that can do the same thing that Acapela Group does. But there are two big things that set My Own Voice apart from the rest of the pack: speed and purpose.

My Own Voice is impressively quick. Unlike other services, which often require hours of reference audio to create a realistic-sounding clone, My Own Voice’s AI can spin up an astonishingly good synthetic voice after hearing just 50 short sentences, or roughly 3 minutes of recorded audio. It’s basically just like that Mission Impossible scene: they’ve developed a streamlined set of reference sentences that make it easier for their AI to learn how you sound, so instead of manually recording every conceivable word, all you have to do is talk through a handful of easy phrases.

Arguably more important than the software’s speed, though, is its purpose. Again, this tech isn’t particularly novel. A handful of noteworthy startups have spun up similar voice-cloning tech, like Canadian startup Lyrebird or the London-based firm Sonantic. But both of those startups were quickly acquired, and their voice-cloning tech ended up being used for AI overdubbing in movies and video-editing software.

That’s not to say that those aren’t good uses of voice cloning tech. They absolutely are, and they’re probably quite profitable ones to boot — but that’s precisely what makes My Own Voice so cool. It’s not often that you encounter such a powerful technology that, rather than being built for entertainment or productivity, was developed specifically to help disadvantaged people and quite literally give them a voice.


Drew Prindle
Former Digital Trends Contributor
Drew Prindle is an award-winning writer, editor, and storyteller who currently serves as Senior Features Editor for Digital…