
New voice ID system reads throat vibrations to determine who’s speaking

We live in a world in which technology that can recognize and respond to spoken words, whether it’s smart speakers like Google Home or dictation services on our PCs, is an everyday matter. But while there have been attempts to make systems that will respond only to one person’s voice, these are not infallible.

That’s the problem a new research project at the University of Michigan sets out to solve. Investigators there have created an authentication technology called VAuth, which promises accurate person-specific voice recognition that no talented impressionist, recording, or voice-mimicking A.I. will be able to get past. It does this via a wearable accessory, currently in the form of a necklace, earbuds, or a glasses attachment, that uses an accelerometer to measure the subtle skin vibrations in a person’s face, throat, or chest when they talk. These vibrations, combined with the sound, provide a unique identifier for each person. In other words, the system doesn’t just confirm that these are your own words, but that they are emerging from your own throat, too.
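
To make the idea concrete, here is a minimal Python sketch of the general principle: accept a spoken command only if the sound picked up by a microphone lines up with the vibrations measured on the wearer's body. This is not VAuth's actual algorithm, which the article does not describe. The function names, the 0.8 threshold, and the synthetic signals are all assumptions made for illustration.

```python
import math
import random


def normalized_correlation(a, b):
    """Pearson correlation between two equal-length sample sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("signals must be non-empty and the same length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    dev_a = [x - mean_a for x in a]
    dev_b = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    return numerator / denominator if denominator else 0.0


def matches_wearer(mic, accel, threshold=0.8):
    """Hypothetical check: does the microphone audio track the body vibrations?

    The threshold is an illustrative assumption, not a value from the research.
    """
    return normalized_correlation(mic, accel) >= threshold


# Synthetic data (assumed): 0.1 s of a 120 Hz tone sampled at 8 kHz stands in for speech.
rng = random.Random(0)
sample_rate = 8000
speech = [math.sin(2 * math.pi * 120 * t / sample_rate) for t in range(800)]

# Genuine case: the accelerometer sees an attenuated, slightly noisy copy of the speech.
accel_genuine = [0.3 * s + rng.gauss(0, 0.02) for s in speech]

# Replay case: a loudspeaker plays the same speech, but the wearer is silent,
# so the accelerometer records only sensor noise.
accel_idle = [rng.gauss(0, 0.02) for _ in speech]

genuine_ok = matches_wearer(speech, accel_genuine)
replay_ok = matches_wearer(speech, accel_idle)
print("genuine accepted:", genuine_ok)
print("replay accepted:", replay_ok)
```

In this toy setup the replayed command is rejected because nothing on the wearer's body vibrates in step with the audio. A real system would also have to handle time alignment, differing sensor bandwidths, and body motion, none of which this sketch attempts.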

“VAuth does not rely on voice identification, rather it complements voice identification technologies by providing a physical assurance using the vibrations collected from the user’s body,” Kang Shin, a professor in the electrical engineering and computer science department, told Digital Trends. “That’s a departure from relying only on a voice biometric which, similar to a fingerprint, is not easy to keep protected. From a few recordings of the user’s voice, an attacker can impersonate the user by generating a matching ‘voice print,’ such as WaveNet from DeepMind. In such a case, the users can do little to regain their security as they cannot simply change their voice.”

In tests involving 18 users and 30 different vocal commands, VAuth demonstrated an accuracy of 97 percent, regardless of a person’s accent or language, or whether they were moving at the time. It also fended off spoofing attempts, such as playback of a person’s actual voice or impersonations.

“There are already many commercial voice assistant products and services, and we expect a lot more to come in future,” Shin continued. “A solution like VAuth is therefore essential. We have built a prototype using off-the-shelf components, which were not designed for our purpose, and tested the VAuth functionality and accuracy. [Next] we are planning to commercialize VAuth. To this end, we are planning to miniaturize the device, and conduct a more thorough evaluation for its robustness and accuracy for various use-cases and environments.”
