
5 ways that future A.I. assistants will take voice tech to the next level


Since Siri debuted on the iPhone 4s back in 2011, voice assistants have gone from unworkable gimmick to the basis for smart speaker technology found in one in six American homes.

“Before Siri, when I talked about [what I do] there were blank stares,” Tom Hebner, head of innovation at Nuance Communications, which develops cutting edge A.I. voice technology, told Digital Trends. “People would say, ‘Do you build those horrible phone systems? I hate you.’ That was one group of people’s only interaction with voice technology.”

That’s no longer the case today. According to eMarketer forecasts, almost 100 million smartphone users will be using voice assistants by 2020. But while A.I. assistants are no longer a novelty, we’re still at the start of their evolution. There’s a long way to go before they fully live up to their promise as a product category.

Here are five ways in which the technology could improve to become smarter and more efficient, and help us lead more productive lives as a result. Call them “predictions” or a “wishlist”; either way, these are the challenges that need to be solved.

Mo’ knowledge, less problems

Alexa can tell you what the weather is in Kuala Lumpur, Malaysia; the total number of U.S. dollars you’ll get for 720 South African rand; and how to spell “disestablishmentarianism.” But consumer A.I. assistants are, in essence, the digital equivalent of a person with a complete set of up-to-date encyclopedias. You get (hopefully) the right information, but there’s no pro-grade level of expertise there.

“The challenge that the systems in your home have is that there’s such a broad range of things that they’re trying to do,” Hebner told Digital Trends.


This is a tough one to solve, but doing so would be a game-changer. Nuance develops many specialist systems aimed at one specific use case, such as answering airline customers’ queries or helping doctors take notes. That narrow focus not only means these systems can drill down to get more detailed information, but also means that more intelligence can get baked in. “People were very excited about computers that could understand words, but that doesn’t necessarily matter if you don’t know what to do with those words,” Hebner said.

One example he gives is of a Nuance system that not only understands when doctors read out a list of potential drugs for patients, but can also call out potential conflicts. This is way beyond the capabilities of most consumer-grade A.I. assistants.

However, having more specialized, detailed knowledge of different domains, something hinted at by Alexa Skills, could be transformative. Asking your smart speaker for legal or medical advice sounds, on the face of it, crazy. But there have been extraordinary advances in fields like legal bots, while a recently published report suggests Apple wants Siri to be able to have health-focused conversations with users by 2021.

Specialist knowledge graphs for A.I. assistants are the stuff of sci-fi dreams right now, although a recent report shows just how rapidly virtual assistants’ skillsets are expanding. When skills move into the terrain of specialities, though, we’re going to be in for a treat!

More (and better) personalization

Personalization of today’s smart speakers is still in its infancy. You can change a voice assistant’s accent and presenting gender, add or remove skills, and feed it bits of information like your name and place of work. In some cases, you can set up multiple voice profiles so that Google Home will recognize the individual members of your household.


But there’s still a long way to go, although the juice should be worth the squeeze. Mattersight Corporation has developed A.I. call center technology, called Predictive Behavioral Routing, which analyzes the speech patterns of callers and matches them up with human operatives with compatible personality types. According to the company, matching a caller with an operative of a compatible personality type results in a successful call that lasts just half as long as one with an operative of a conflicting personality type.

Using a similar approach could result in A.I. assistants that talk back to you the way you like to be addressed. That could be something as simple as matching the accent and voice volume of the person they’re speaking with. Or an assistant could change the way it presents ideas, perhaps using more emotive words for some users and denser, more detailed information for others. Maybe some people want a voice assistant to chat to at length, while others simply want one to convey the necessary information in the most concise manner possible. A.I. assistants should be capable of both.

Technologies like Google Duplex show just how convincingly accurate A.I.-generated synthesized voices and conversations are getting. As A.I.s move into areas more complex than dishing up song requests and food timers, expect this technology to play a major role.

This could be aided by breakthroughs in the ability to identify users by voice. Hebner notes that Nuance’s technology can ID users from just a single solitary second of audio. “It used to take 10 seconds to understand who you are, to get an accurate signal,” he said. “The power of that is significant.” Being able to identify users by a small snippet of voice solves the password problem, and opens up the opportunity to use voice assistants for more delicate, confidential information.

Getting proactive

A good assistant will do something when you ask them to. A great assistant won’t need asking. Right now, A.I. assistants are still at this first stage. Users can get the song they want or the reminder they need, but typically only when it’s been explicitly requested. As people get more comfortable with voice assistants, there’s a great opportunity for them to move beyond being purely reactive devices to proactive ones.


How would you feel about an A.I. assistant making decisions on your behalf? These could be anything from cranking up the thermostat when someone says they’re cold or rebooking a lunch meeting because you’re running late, to nudging you to do more exercise or get better at saving your paycheck. As more and more smart devices make their way into the home, the number of things a voice assistant could conceivably command will greatly increase.

Part of this is a social question about how comfortable people are with machines making decisions on their behalf. There are big questions about whether or not people want to hand certain jobs over to machines. Think of it like giving your credit card and house keys to your flesh-and-blood assistant, only with a much bigger sprinkling of Skynet. The downside is giving up a certain amount of control. The potential upside is increasing your free time. Of course, there is a big technical challenge…

It’s all about the feedback

Tom Hebner pointed out a big challenge with the issue of proactivity: how do our machines know when they’ve got it right? Returning to the idea of the good vs. great assistant, a great assistant might have all your files out ahead of a big meeting, without you needing to ask. But what if they’re the wrong files? A big issue with making home A.I. assistants more proactive is that there are currently limited ways of revealing whether or not the information we’re getting is the right information.


“If I ask for the same song every day when I walk into my house, and then one day I walk in and it just starts playing, how do they know that they got it right?” Hebner said. “If I don’t stop it playing, does that mean it’s right? If I do say ‘stop,’ does that mean it got it wrong and it should never do it again? The feedback mechanism is one of the reasons you’re not getting more proactive systems.”

This is a challenging one for engineers to figure out. Anyone who’s ever had an intern asking them for instruction and feedback on every single task knows that sometimes it’s easier to do a job yourself than delegate it. An A.I. assistant is there to make your life more frictionless, not to give you dozens of mini surveys each day to confirm it has done its job right. This will need to be solved in a way that doesn’t cripple the user-friendliness of these devices, and doesn’t require a whole lot of training up front before systems learn your preferences.

What’s the answer? I’m not sure. But, as Steve Jobs once said, it’s not the job of the customer to figure it out.

New interaction methods

There’s a scene in 2001: A Space Odyssey in which the murderous HAL 9000, disconcertingly still the most famous fictional A.I. assistant in history, reveals that it doesn’t just use microphones to determine what is being said to it. When two crew members try to choose a location to speak where they know HAL can’t hear, HAL shows that it can still understand them by reading their lips.


Scary moment of the movie? Sure. An example of how A.I. assistants could work in the future? Um, sure!

The idea that voice assistants should be limited to voice diminishes the possible number of ways they could usefully interact with us. With the rise of facial recognition and emotion-tracking technologies, an ever-growing number of biometrics gathered about users on a constant basis, and even the possibility of mind-reading tech on the horizon, there are plenty of different signals which could be used by A.I. assistants to draw their conclusions.

The idea that, 10 years from now, we’ll only be using voice to control these A.I. assistants is like looking at PCs in the early ’80s and thinking we’ll never have more than a keyboard at our disposal.


Luke Dormehl
I'm a UK-based tech writer covering Cool Tech at Digital Trends. I've also written for Fast Company, Wired, the Guardian…