AI can now duplicate anyone's voice based on just one minute of training

Ian Allenden/123RF
Do you remember the cool Mission Impossible tech that lets Tom Cruise’s character Ethan Hunt mimic the voice of other characters using some nifty speech synthesis technology?

Well, a Montreal-based startup called Lyrebird (named after the sound-imitating bird) just invented it for real.

“We are developing new speech synthesis technologies which, among other features, allow us to copy the voice of someone with very little data,” said Alexandre de Brebisson, one of the PhD students who developed the deep-learning tech behind the project. “Our experiments show that one minute of audio already contains a lot of the DNA of a human voice. We are able to learn a new voice with as little data because our model is able to capture similarities between the new voice and all the voices it already knows. Our models understand the underlying variables that make [one] voice different from another.”

Since the tech was shown off this week, de Brebisson said, his team has received dozens of suggested use cases by email, some describing applications they’d already thought of and others describing ones they hadn’t.

Some companies, for example, are interested in letting their users choose to have audiobooks read in the voice of either famous people or family members. Medical companies could similarly let people with voice disabilities train synthetic voices to sound like themselves, provided recorded samples of their speaking voices exist. Another interesting idea is for video game companies to let in-game characters speak with the voice of the human player.

There are plenty more exciting opportunities, which have led to 10,000 people already signing up to be informed of the forthcoming beta version. “We will then add features over time, such as letting companies design a unique voice tailored for their needs, and control the emotion of the [voice] generation,” de Brebisson continued.

While it doesn’t sound perfect yet, it’s not hard to imagine how this might sound in just a few years. Combined with technology such as software for making convincing edits to the moving lips of a person who is speaking, “fake news” circa 2025 should certainly be a whole lot of fun.

Right?
