For the first time, neuroengineers have developed a system capable of translating thoughts directly into recognizable speech, marking an important step toward more advanced brain-computer interfaces for people who lack the ability to speak.
The system, which was created by researchers at Columbia University, works by monitoring a person’s brain activity, identifying brain signals, and reconstructing the words the individual hears. Powered by speech synthesizers and artificial intelligence, the technology lays the groundwork for helping individuals who are unable to speak due to disability regain their capacity to communicate verbally.
“Our ultimate goal is to develop technologies that can decode the internal voice of a patient who is unable to speak, such that it can be understood by any listener,” Nima Mesgarani, an electrical engineer at Columbia University who led the project, told Digital Trends by email.
Parts of the brain light up like a Christmas tree, with neurons firing left and right, when people speak or even simply think about speaking. Neural researchers have long endeavored to decode the patterns that emerge in these signals. But it isn’t easy. For years, scientists like Mesgarani have tried to translate brain activity into intelligible speech, using tools like computer models to analyze visual representations of sound frequencies.
In their recent work, Mesgarani and his team used a computer algorithm called a vocoder, which can generate speech-like sounds when trained on recordings of human speech. But to train the vocoder, Mesgarani needed brain models, so he partnered with Ashesh Dinesh Mehta, a neurosurgeon at Northwell Health Physician Partners Neuroscience Institute in New York who treats epilepsy patients.
Mesgarani and Mehta asked some of Mehta’s patients to listen to speech recordings and measured their brain activity. The patterns in their brain activity trained the vocoder. The researchers then recorded the patients’ brain activity as they listened to people count to nine, which the vocoder attempted to recite by analyzing the neural signals.
The result isn’t perfect. The sounds it produces are robotic and, even after an A.I. system was used to “clean up” the vocoder’s output, only vaguely recognizable. But the researchers found that listeners could understand and repeat the sounds about 75 percent of the time.
Moving forward, the researchers plan to trial more complicated words before moving on to sentences. Their end goal is to integrate the system into an implant that could translate thoughts directly into words.
A paper detailing the research was published last month in the journal Scientific Reports.