
Facebook A.I. could fix one of the most annoying problems in video chat apps

Communication on Facebook might be predominantly carried out via text, but the social media giant may nonetheless help to solve some of the biggest challenges with audio communication. On Friday, July 10, ahead of the International Conference on Machine Learning, Facebook announced that it has developed a new artificial intelligence system that’s able to distinguish up to five voices speaking simultaneously.

That could be transformative for everything from next-gen hearing aids and smart speakers that dial in on and amplify certain voices to future Zoom-style video conferencing tools that learn to prioritize speakers so that people stop talking over each other.

“This is a supervised learning approach for speech separation,” Eliya Nachmani, a research assistant at FAIR (Facebook A.I. Research) Tel Aviv, told Digital Trends. “For the first time, we are showing that it’s possible to separate five separate speakers from a single microphone recording. We are also showing how the model can detect the number of the speakers in the recording and perform accordingly. The model is mask-free, meaning that we don’t estimate masking that removes other voices. Instead, our model learns to filter out the other voices or background noise.”


This “mask-free” element is significant. Previous models that achieved impressive benchmarks used a mask to remove other voices. The problem with that approach is that the models get worse as the number of speakers increases or when that number is unknown. Facebook’s separation model still requires the number of speakers to be specified, but the system automatically figures out how many people are talking and then selects the most appropriate model for that number.
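To make the "count the speakers, then pick a model" idea concrete, here is a minimal Python sketch. It is not Facebook's code: the names `pick_model`, `separate_voices`, and `make_placeholder_separator` are hypothetical, the placeholder separators simply copy the input rather than separating anything, and the supported range of two to five speakers is an assumption based on the five-voice figure mentioned above.

```python
from typing import Callable, Dict, List, Sequence

Audio = Sequence[float]
Separator = Callable[[Audio], List[Audio]]


def make_placeholder_separator(num_speakers: int) -> Separator:
    """Hypothetical stand-in for a trained model built for num_speakers voices.

    It only copies the mixture once per speaker; a real model would return
    one cleaned-up signal per voice.
    """
    def separate(mixture: Audio) -> List[Audio]:
        return [list(mixture) for _ in range(num_speakers)]
    return separate


def pick_model(estimated: int, models: Dict[int, Separator]) -> int:
    """Return the supported speaker count closest to the estimate."""
    return min(models, key=lambda k: (abs(k - estimated), k))


def separate_voices(
    mixture: Audio,
    count_speakers: Callable[[Audio], int],
    models: Dict[int, Separator],
) -> List[Audio]:
    """Estimate the number of speakers, then run the model built for it."""
    estimated = count_speakers(mixture)  # assumed detector, supplied by the caller
    chosen = pick_model(estimated, models)
    return models[chosen](mixture)


# One placeholder model per supported speaker count (assumed range: 2 to 5).
MODELS: Dict[int, Separator] = {
    n: make_placeholder_separator(n) for n in range(2, 6)
}
```

Calling `separate_voices(recording, detector, MODELS)` with a detector that reports four speakers would route the recording to the four-speaker model. How the real system estimates the speaker count is not described here, so the detector is left as a caller-supplied function.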

Nachmani pointed out that this speech separation technology could have other applications as well. In addition to separating voices, it could also sort other sounds from background noise. For instance, that could allow it to isolate different musical instruments from a single audio file.

Will any of this technology find its way into a Facebook product any time soon? That much is not clear. This is fundamental A.I. research, which isn’t necessarily going to be baked into a future Facebook app. But it’s certainly easy to see how such a tool might be useful. Given that Facebook already offers various video and voice chat features, it’s not out of the realm of possibility that this could make its way into a future product.

This A.I. demonstration is just one of more than 30 papers Facebook is discussing at the International Conference on Machine Learning, which kicks off this weekend.
