Will computerized voices ever sound human?

Sounding robotic has never been a compliment, but with the right amount of tinkering, computer scientists and engineers are hoping that this may soon change. Computerized voices haven’t quite hit the mark yet on sounding … human, and closing that gap is the subject of one of the tech industry’s latest major efforts. As a growing number of devices begin speaking to us, from Apple’s Siri to Amazon’s Alexa to our GPS systems, it’s becoming increasingly important for machines to have voices we actually want to listen to.

As the New York Times reports, the relatively new focus area of “conversational agents,” in the little-understood field of human-computer interaction design, seeks to build programs that understand language and are also able to respond to commands. Today, it is impossible to render a computer’s voice indistinguishable from a human’s, at least for anything more complex than short bits of information, such as whether it’ll rain or when to turn left.

Part of the issue lies in “prosody,” which is the capacity to correctly enunciate or stress certain syllables — saying words the way an actual human would. And of course, there’s also the uniquely human ability to add emotion into pronunciation. After all, we don’t always say “good” or even “left” in the same way. Machines, on the other hand, have yet to master that nuance.

“The problem is we don’t have good controls over how we say to these synthesizers, ‘Say this with feeling,’” Scottish computer scientist and Carnegie Mellon professor Alan Black told the Times. And it may still be some time before we’re actually able to do this at all.

But that might not be a bad thing, some say. “Jarring is the way I would put it,” Brian Langner, senior speech scientist at digital speech company ToyTalk, said about having machines sound too much like humans. “When the machine gets some of those things correct, people tend to expect that it will get everything correct.”

So no, you probably won’t be able to get Siri to sound like your mom anytime soon. But you may want to enjoy that inability while you can.

Lulu Chang
Former Digital Trends Contributor
Fascinated by the effects of technology on human interaction, Lulu believes that if her parents can use your new app…