
AI assistants will soon recognize and respond to the emotion in your voice

You know how people say it's not what you say, but how you say it that matters? Very soon, that insight could become part of smart assistants such as Amazon's Alexa or Apple's Siri. At least, it could if those companies adopt new technology developed by emotion-tracking artificial intelligence company Affectiva.

Affectiva's work has previously focused on identifying emotion in images by observing how a person's face changes when they express particular sentiments. Its latest technology builds on that premise with a cloud-based application program interface (API) that detects emotion in speech. Built on deep learning, the system tracks changes in tone, volume, speed, and voice quality, and uses those cues to recognize events such as anger, laughter, and arousal in recorded speech.
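To make the idea of a cloud-based emotion API concrete, here is a minimal client-side sketch. The endpoint URL, request fields, and response shape are assumptions made purely for illustration; they are not Affectiva's documented interface.

```python
import requests  # standard HTTP client

# Hypothetical endpoint and credentials; Affectiva's real API may differ.
API_URL = "https://api.example.com/v1/speech/emotions"
API_KEY = "YOUR_API_KEY"

def detect_speech_emotion(audio_path: str) -> dict:
    """Upload a recorded speech clip to a cloud emotion-detection API
    and return per-segment emotion labels with confidence scores."""
    with open(audio_path, "rb") as f:
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"audio": f},
        )
    response.raise_for_status()
    # Assumed response shape, e.g.:
    # {"events": [{"label": "anger", "start": 1.2, "confidence": 0.91}]}
    return response.json()

if __name__ == "__main__":
    print(detect_speech_emotion("sample.wav"))
```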


“The addition of Emotion AI for speech builds on Affectiva’s existing emotion recognition technology for facial expressions, making us the first AI company to allow for a person’s emotions to be measured across face and speech,” Rana el Kaliouby, co-founder and CEO of Affectiva, told Digital Trends. “This is all part of a larger vision that we have. People sense and express emotion in many different ways: Through facial expressions, voice, and gestures. We’ve set out to develop multi-modal Emotion AI that can detect emotion the way humans do from multiple communication channels. The launch of Emotion AI for speech takes us one step closer.”


Affectiva developed its speech-analysis system by collecting naturalistic speech data from a variety of sources, including commercially available databases. Human experts then labeled this data for the occurrence of what the company calls "emotion events." These human-generated labels were used to train and validate the team's deep learning models, which over time learned how certain shifts in a person's voice can indicate a particular emotion.
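Below is a minimal sketch of that labeling-and-training pipeline, substituting hand-picked acoustic features and an off-the-shelf classifier for Affectiva's deep learning models. The feature choices, file names, and labels are illustrative assumptions, not the company's actual setup.

```python
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def acoustic_features(path: str) -> np.ndarray:
    """Summarize a clip with rough proxies for tone, volume, and voice quality."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # timbre / voice quality
    rms = librosa.feature.rms(y=y)                      # loudness
    zcr = librosa.feature.zero_crossing_rate(y)         # rough noisiness proxy
    return np.concatenate([mfcc.mean(axis=1), rms.mean(axis=1), zcr.mean(axis=1)])

# Human-labeled "emotion events": (clip path, label) pairs.
labeled = [("clip1.wav", "anger"), ("clip2.wav", "laughter"), ("clip3.wav", "neutral")]

X = np.array([acoustic_features(path) for path, _ in labeled])
y = [label for _, label in labeled]

clf = RandomForestClassifier().fit(X, y)  # stand-in for the article's deep models
print(clf.predict(X[:1]))
```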


It's clever from a technical standpoint, but like the best technology, it also promises practical benefits. One application could be a car navigation system that hears a driver beginning to experience road rage and reacts to head off a rash driving decision. Similarly, automated assistants could change their approach when they hear anger or frustration from a user, or learn which kinds of responses elicit the best reactions and repeat those strategies.
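As a rough sketch of that downstream logic, an assistant might branch on the detected emotion scores like this. The score dictionary, thresholds, and strategy names are all hypothetical.

```python
# Hypothetical thresholds for switching response strategies.
ANGER_THRESHOLD = 0.8
FRUSTRATION_THRESHOLD = 0.6

def choose_strategy(emotion_scores: dict) -> str:
    """Pick a response strategy from per-emotion confidence scores."""
    if emotion_scores.get("anger", 0.0) > ANGER_THRESHOLD:
        return "calming"   # e.g. suggest a break, slow speech, simpler directions
    if emotion_scores.get("frustration", 0.0) > FRUSTRATION_THRESHOLD:
        return "clarify"   # rephrase the last answer instead of repeating it
    return "default"

print(choose_strategy({"anger": 0.93}))  # -> "calming"
```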

Luke Dormehl
Former Digital Trends Contributor