OpenAI needs just 15 seconds of audio for its AI to clone a voice

In recent years, the amount of audio an AI system needs to clone someone’s voice has been getting shorter and shorter.

It used to be minutes, now it’s just seconds.

OpenAI, the Microsoft-backed company behind the viral generative AI chatbot ChatGPT, recently revealed that its own voice-cloning technology requires just 15 seconds of audio material to reproduce someone’s voice.

In a post on its website, OpenAI shared a small-scale preview of a model called Voice Engine, which it’s been developing since late 2022.

Voice Engine takes a minimum of 15 seconds of spoken material as its reference sample. The user can then input text to create what OpenAI describes as “emotive and realistic” speech that “closely resembles the original speaker.”

OpenAI insists it is taking a “cautious and informed approach to a broader release due to the potential for synthetic voice misuse,” adding that it wants to “start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities.”

It added: “Based on these conversations and the results of these small scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.”

One of the misuses that OpenAI refers to is a scam that some criminals are already carrying out using similar technology that’s been publicly available for some time. It involves cloning a voice and then calling a friend or relative of that person to trick them into handing over cash via a bank transfer. There are also fears about how such technology might be used in the upcoming presidential election, an issue highlighted by a recent high-profile incident in which a robocall using a clone of President Joe Biden’s voice told people not to vote in January’s New Hampshire primary.

Another concern is how the rapidly improving technology will affect the livelihoods of voice actors, who fear that they’ll increasingly be asked to sign over the rights to their voice so that AI can create a synthetic version. Compensation for such a contract is likely to be much lower than if the actor were asked to perform the job in person.

Looking at more positive deployments of the technology, OpenAI suggests that it could provide reading assistance to non-readers and children through natural-sounding, emotive voices “representing a wider range of speakers than what’s possible with preset voices.” Another suggested use is instant translation of videos and podcasts, something that Spotify is already trialing.

It could also be used to help patients who are gradually losing their voice through illness to continue communicating using what sounds like their own voice.

OpenAI has some examples of the AI-generated audio and the reference audio on its website, and we’re sure you’ll agree that they’re pretty extraordinary.

Trevor Mogg
Contributing Editor
Not so many moons ago, Trevor moved from one tea-loving island nation that drives on the left (Britain) to another (Japan)…