
A.I. system seeks to turn thoughts of people unable to talk into speech

For the first time, neuroengineers have developed a system capable of translating thoughts directly into recognizable speech, marking an important step toward more advanced brain-computer interfaces for people who lack the ability to speak.

The system, which was created by researchers at Columbia University, works by monitoring a person’s brain activity and reconstructing the words the individual hears from those neural signals. Powered by speech synthesizers and artificial intelligence, the technology lays the groundwork for helping individuals who are unable to speak due to disability regain their capacity to communicate verbally.

“Our ultimate goal is to develop technologies that can decode the internal voice of a patient who is unable to speak, such that it can be understood by any listener,” Nima Mesgarani, an electrical engineer at Columbia University who led the project, told Digital Trends by email.

Parts of the brain light up like a Christmas tree, neurons firing left and right, when people speak or even simply think about speaking. Neuroscientists have long endeavored to decode the patterns that emerge in these signals. But it isn’t easy. For years, scientists like Mesgarani have tried to translate brain activity into intelligible speech, using tools like computer models to analyze visual representations of sound frequencies.
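Those “visual representations of sound frequencies” are spectrograms. As a rough illustration (not the researchers’ code), the short Python sketch below turns a synthetic audio signal into the kind of time-frequency picture such models analyze; the sample rate and test tones are arbitrary assumptions.

```python
# Illustrative sketch only: compute a spectrogram, the time-frequency
# "picture" of sound that earlier decoding approaches tried to reconstruct
# from brain activity. This is not the Columbia team's code.
import numpy as np
from scipy.signal import spectrogram

fs = 16_000                      # sample rate in Hz (assumed)
t = np.arange(0, 1.0, 1 / fs)    # one second of audio
audio = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 440 * t)

# Short-time Fourier analysis: how much energy each frequency carries
# in each ~16 ms window of the signal.
freqs, times, power = spectrogram(audio, fs=fs, nperseg=256)

print(power.shape)  # (frequency bins, time frames)
```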

In their recent work, Mesgarani and his team used a computer algorithm called a vocoder, which can generate speech-like sounds when trained on recordings of human speech. But to teach the system to map brain activity onto those sounds, Mesgarani needed neural recordings, so he partnered with Ashesh Dinesh Mehta, a neurosurgeon at Northwell Health Physician Partners Neuroscience Institute in New York who treats epilepsy patients.

Mesgarani and Mehta asked some of Mehta’s patients to listen to speech recordings and measured their brain activity; the patterns in that activity were used to train the system. The researchers then recorded the patients’ brain activity as they listened to people count to nine, and the vocoder attempted to reproduce the numbers by analyzing those neural signals.
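The broad recipe is: record neural activity while a listener hears known speech, then learn a mapping from those recordings to the acoustic parameters a synthesizer needs. The sketch below is a deliberately simplified stand-in using synthetic arrays and ordinary ridge regression; the study itself relied on deep neural networks and a trained vocoder, and the array names and sizes here are assumptions made for illustration.

```python
# Deliberately simplified illustration of the decoding idea, not the
# published method: learn a mapping from neural features to the acoustic
# parameters a synthesizer (e.g. a vocoder) would consume.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

n_frames, n_electrodes, n_acoustic = 2000, 64, 32

# Stand-in data: per-frame neural features (e.g. signal power per
# electrode) and the matching acoustic parameters of the heard speech.
neural = rng.normal(size=(n_frames, n_electrodes))
mixing = rng.normal(size=(n_electrodes, n_acoustic))
acoustic = neural @ mixing + 0.1 * rng.normal(size=(n_frames, n_acoustic))

# Fit on "training" listening sessions, decode a held-out recording.
train, test = slice(0, 1500), slice(1500, None)
decoder = Ridge(alpha=1.0).fit(neural[train], acoustic[train])
decoded = decoder.predict(neural[test])

# In the real system, frames like `decoded` would drive a vocoder to
# produce audible speech; here we just check reconstruction quality.
corr = np.corrcoef(decoded.ravel(), acoustic[test].ravel())[0, 1]
print(f"correlation with true acoustic parameters: {corr:.2f}")
```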

The result isn’t perfect. The sounds it produces are robotic and only vaguely recognizable, even after an A.I. system was used to clean up the vocoder’s output. Still, the researchers found that listeners could understand and repeat the sounds about 75 percent of the time.

Moving forward, the researchers plan to test more complicated words before moving on to sentences. Their end goal is to integrate the system into an implant that could translate thoughts directly into words.

A paper detailing the research was published last month in the journal Scientific Reports.
