This humanlike synthesized speech could be the future of audiobooks

Synthesized voices like those used by Siri and Alexa are fine for telling us the day’s weather forecast or how many minutes remain on a cooking timer, but would you really want their flat, monotonous tones reading you audiobooks? Probably not, which is why most of us turn to human-voiced services like Audible to get our audiobook fix. Human voice actors might not get the nod for too much longer, however, due to the pioneering work of a London-based startup called DeepZen.

Using artificial intelligence algorithms, augmented by the technological firepower of IBM’s Power A.I. and Watson technologies, DeepZen has developed text-to-speech tools that not only sound human at first listen, but can also pick up on the emotional cues needed for reading text in a compelling manner. In doing so, the company claims that it could reduce the time and cost to produce audiobooks by up to 90%.

“Our system is truly revolutionary,” Taylan Kamis, CEO and co-founder of DeepZen, told Digital Trends. “It works using deep learning and neural networks to understand how a human talks and reads. We then train the system so it can recognize where to apply the right emotions and intonation when reading a piece of text. The result is humanlike speech very closely resembling the real thing.”

Inevitably, work like this can be cast as yet another example of cutting-edge A.I. tools threatening a human profession. In this case, that profession involves actors who, despite what a few high-profile figures are able to achieve, don’t have the most stable careers as it is. It would be naive to think that software such as this won’t have an impact on the future of voice actors, but, as Kamis points out, there are plenty of scenarios in which tools such as DeepZen’s could be a net positive for humanity.

For example, it could make possible the creation of audiobooks based on works by new and emerging writers, or from publishers who don’t have the luxury of big budgets. It could also be used to help develop superior text-to-speech tools for people who have dyslexia or otherwise have trouble reading.

“As for the future, we are also looking at producing voice-overs for the video production industry, as well as gaming, where there is a need for real-time text-to-speech to enhance the player experience,” Kamis said. “We are also looking at other languages.”

You can check out a sample of the system here.

Luke Dormehl
I'm a UK-based tech writer covering Cool Tech at Digital Trends. I've also written for Fast Company, Wired, the Guardian…